(57) Abstract



Disclosed is a nucleotide sequence encoding an effective portion of a class A starch branching enzyme (SBE) obtainable from potato 
plants, or a functional equivalent thereof, together with, inter alia, a corresponding polypeptide, a method of altering the characteristics of 
a plant, a plant having altered characteristics; and starch, particularly starch obtained from a potato plant, having novel properties. 
Title: Improvements in or Relating to Plant Starch Composition

Field of the Invention 

This invention relates to novel nucleotide sequences, polypeptides encoded thereby, vectors 
and host cells and host organisms comprising one or more of the novel sequences, and to
a method of altering one or more characteristics of an organism. The invention also
relates to starch having novel properties and to uses thereof. 

Background of the Invention 

Starch is the major form of carbon reserve in plants, constituting 50% or more of the dry 
weight of many storage organs - e.g. tubers, seeds of cereals. Starch is used in numerous 
food and industrial applications. In many cases, however, it is necessary to modify the 
native starches, via chemical or physical means, in order to produce distinct properties to 
suit particular applications. It would be highly desirable to be able to produce starches 
with the required properties directly in the plant, thereby removing the need for additional
modification. To achieve this via genetic engineering requires knowledge of the metabolic
pathway of starch biosynthesis. This includes characterisation of genes and encoded gene
products which catalyse the synthesis of starch. Knowledge about the regulation of starch
biosynthesis raises the possibility of "re-programming" biosynthetic pathways to create
starches with novel properties that could have new commercial applications. 

The commercially useful properties of starch derive from the ability of the native granular 
form to swell and absorb water upon suitable treatment. Usually heat is required to cause 
granules to swell in a process known as gelatinisation, which has been defined (W A
Atwell et al, Cereal Foods World 33, 306-311, 1988) as "... the collapse (disruption) of
molecular orders within the starch granule manifested in irreversible changes in properties 
such as granular swelling, native crystallite melting, loss of birefringence, and starch 
solubilisation. The point of initial gelatinisation and the range over which it occurs is 
governed by starch concentration, method of observation, granule type, and heterogeneities 
within the granule population under observation". A number of techniques are available 
for the determination of gelatinisation as induced by heating, a convenient and accurate
method being differential scanning calorimetry, which detects the temperature range and 
enthalpy associated with the collapse of molecular orders within the granule. To obtain 
accurate and meaningful results, the peak and/or onset temperature of the endotherm 
observed by differential scanning calorimetry is usually determined. 

The consequence of the collapse of molecular orders within starch granules is that the 
granules are capable of taking up water in a process known as pasting, which has been 
defined (W A Atwell et al, Cereal Foods World 33, 306-311, 1988) as "... the
phenomenon following gelatinisation in the dissolution of starch. It involves granular 
swelling, exudation of molecular components from the granule, and eventually, total 
disruption of the granules". The best method of evaluating pasting properties is 
considered to be the viscoamylograph (Atwell et al, 1988 cited above) in which the 
viscosity of a stirred starch suspension is monitored under a defined time/temperature 
regime. A typical viscoamylograph profile for potato starch shows an initial rise in 
viscosity, which is considered to be due to granule swelling. In addition to the overall 
shape of the viscosity response in a viscoamylograph, a convenient quantitative measure 
is the temperature of initial viscosity development (onset). Figure 1 shows such a typical 
viscosity profile for potato starch, during and after cooking, and includes stages A-D 
which correspond to viscosity onset (A), maximum viscosity (B), complete dispersion (C) 
and reassociation of molecules (or retrogradation, D). In the figure, the dotted line 
represents viscosity (in stirring number units) of a 10% w/w starch suspension and the 
unbroken line shows the temperature in degrees centigrade. At a certain point, defined 
by the viscosity peak, granule swelling is so extensive that the resulting highly expanded 
structures are susceptible to mechanically-induced fragmentation under the stirring 
conditions used. With increased heating and holding at 95 °C, further reduction in 
viscosity is observed due to increased fragmentation of swollen granules. This general 
profile has previously always been found for native potato starch. 

After heating starches in water to 95 °C and holding at that temperature (for typically 15
minutes), subsequent cooling to 50 °C results in an increase in viscosity due to the process 
of retrogradation or set-back. Retrogradation (or set-back) is defined (Atwell et al., 1988 
cited above) as "...a process which occurs when the molecules comprising gelatinised
starch begin to reassociate in an ordered structure...". At 50 °C, it is primarily the
amylose component which reassociates, as indicated by the increase in viscoamylograph
viscosity for starch from normal maize (21.6% amylose) compared with starch from waxy
maize (1.1% amylose) as shown in Figure 2. Figure 2 is a viscoamylograph of 10% w/w
starch suspensions from waxy maize (solid line), conventional maize (dots and dashes),
high amylose variety (Hylon 5, dotted line) and a very high amylose variety (Hylon 7,
crosses). The temperature profile is also shown by a solid line, as in Figure 1. The 
extent of viscosity increase in the viscoamylograph on cooling and holding at 50 °C 
depends on the amount of amylose which is able to reassociate due to its exudation from 
starch granules during the gelatinisation and pasting processes. A characteristic of 
amylose-rich starches from maize plants is that very little amylose is exuded from granules 
by gelatinisation and pasting up to 95 °C, probably due to the restricted swelling of the 
granules. This is illustrated in Figure 2 which shows low viscosities for a high amylose 
(44.9%) starch (Hylon 5) from maize during gelatinisation and pasting at 95 °C and little 
increase in viscosity on cooling and holding at 50°C. This effect is more extreme for a 
higher amylose content (58%, as in Hylon 7), which shows even lower viscosities in the 
viscoamylograph test (Figure 2). For commercially-available high amylose starches 
(currently available from maize plants, such as those described above), processing at 
greater than 100°C is usually necessary in order to generate the benefits of high amylose 
contents with respect to increased rates and strengths of reassociation, but use of such high 
temperatures is energetically unfavourable and costly. Accordingly, there is an unmet 
need for starches of high amylose content which can be processed below 100°C and still 
show enhanced levels of reassociation, as indicated for example by viscoamylograph 
measurements. 

The properties of potato starch are useful in a variety of both food and non-food (paper, 
textiles, adhesives etc.) applications. However, for many applications, properties are not 
optimum and various chemical and physical modifications well known in the art are
undertaken in order to improve useful properties. Two types of property manipulation 
which would be of use are: the controlled alteration of gelatinisation and pasting 
temperatures; and starches which suffer less granular fragmentation during pasting than 
conventional starches.

Currently the only ways of manipulating the gelatinisation and pasting temperatures of 
potato starch are by the inclusion of additives such as sugars, polyhydroxy compounds or
salts (Evans & Haisman, Stärke 34, 224-231, 1982) or by extensive physical or chemical
pre-treatments (e.g. Stute, Stärke 44, 205-214, 1992). The reduction of granule
fragmentation during pasting can be achieved either by extensive physical pretreatments
(Stute, Stärke 44, 205-214, 1992) or by chemical cross-linking. Such processes are
inconvenient and inefficient. It is therefore desirable to obtain plants which produce starch 
which intrinsically possesses such advantageous properties. 

Starch consists of two main polysaccharides, amylose and amylopectin. Amylose is a 
generally linear polymer containing α-1,4 linked glucose units, while amylopectin is a
highly branched polymer consisting of an α-1,4 linked glucan backbone with α-1,6 linked
glucan branches. In most plant storage reserves amylopectin constitutes about 75% of the
starch content. Amylopectin is synthesized by the concerted action of soluble starch
synthase and starch branching enzyme [α-1,4 glucan: α-1,4 glucan 6-glycosyltransferase,
EC 2.4.1.18]. Starch branching enzyme (SBE) hydrolyses α-1,4 linkages and rejoins the
cleaved glucan, via an α-1,6 linkage, to an acceptor chain to produce a branched structure.
The physical properties of starch are strongly affected by the relative abundance of 
amylose and amylopectin, and SBE is therefore a crucial enzyme in determining both the 
quantity and quality of starches produced in plant systems. 

In most plants studied to date e.g. maize (Boyer & Preiss, 1978 Biochem. Biophys. Res. 
Comm. 80, 169-175), rice (Smyth, 1988 Plant Sci. 57, 1-8) and pea (Smith, Planta 175, 
270-279), two forms of SBE have been identified, each encoded by a separate gene. A 
recent review by Burton et al. (1995 The Plant Journal 7, 3-15) has demonstrated that
the two forms of SBE constitute distinct classes of the enzyme such that, in general,
enzymes of the same class from different plants may exhibit greater similarity than
enzymes of different classes from the same plant. In their review, Burton et al. termed
the two respective enzyme families class "A" and class "B", and the reader is referred 
thereto (and to the references cited therein) for a detailed discussion of the distinctions 
between the two classes. One general distinction of note would appear to be the presence,
in class A SBE molecules, of a flexible N-terminal domain, which is not found in class
B molecules. The distinctions noted by Burton et al. are relied on herein to define class 
A and class B SBE molecules, which terms are to be interpreted accordingly. 

However, in potato, only one isoform of the SBE molecule (belonging to class B) has thus
far been reported and only one gene cloned (Blennow & Johansson, 1991 Phytochem. 30,
437-444, and Koßmann et al., 1991 Mol. Gen. Genet. 230, 39-44). Further, published
attempts to modify the properties of starch in potato plants (by preventing expression of
the single known SBE) have generally not succeeded (e.g. Müller-Röber & Koßmann 1994
Plant Cell and Environment 17, 601-613).

Summary of the Invention 

In a first aspect the invention provides a nucleotide sequence encoding an effective portion 
of a class A starch branching enzyme (SBE) obtainable from potato plants. 

Preferably the nucleotide sequence encodes a polypeptide comprising an effective portion 
of the amino acid sequence shown in Figure 5 (excluding the sequence MNKRIDL, which 
does not represent part of the SBE molecule), or a functional equivalent thereof (which 
term is discussed below). The amino acid sequence shown in Figure 5 (Seq ID No. 15) 
includes a leader sequence which directs the polypeptide, when synthesised in potato cells, 
to the amyloplast. Those skilled in the art will recognise that the leader sequence is 
removed to produce a mature enzyme and that the leader sequence is therefore not 
essential for enzyme activity. Accordingly, an "effective portion" of the polypeptide is 
one which possesses sufficient SBE activity to complement the branching enzyme mutation 
in E. coli KV 832 cells (described below) and which is active when expressed in E. coli 
in the phosphorylation stimulation assay. An example of an incomplete polypeptide which 
nevertheless constitutes an "effective portion" is the mature enzyme lacking the leader 
sequence. By analogy with the pea class A SBE sequence, the potato class A sequence 
shown in Figure 5 probably possesses a leader sequence of about 48 amino acid residues, 
such that the N terminal amino acid sequence is thought to commence around the glutamic 
acid residue (E) at position 49 (EKSSYN... etc.). Those skilled in the art will appreciate 
that an effective portion of the enzyme may well omit other parts of the sequence shown
in the figure without substantial detrimental effect. For example, the C-terminal glutamic 
acid-rich region could be reduced in length, or possibly deleted entirely, without 
abolishing class A SBE activity. A comparison with other known SBE sequences, 
especially other class A SBE sequences (see for example, Burton et al., 1995 cited above),
should indicate those portions which are highly conserved (and thus likely to be essential 
for activity) and those portions which are less well conserved (and thus are more likely 
to tolerate sequence changes without substantial loss of enzyme activity). 

Conveniently the nucleotide sequence will comprise substantially nucleotides 289 to 2790 
of the DNA sequence (Seq ID No. 14) shown in Figure 5 (which nucleotides encode the 
mature enzyme) or a functional equivalent thereof, and may also include further 
nucleotides at the 5' or 3' end. For example, for ease of expression, the sequence will
desirably also comprise an in-frame ATG start codon, and may also encode a leader 
sequence. Thus, in one embodiment, the sequence further comprises nucleotides 145 to 
288 of the sequence shown in Figure 5. Other embodiments are nucleotides 228 to 2855 
of the sequence labelled "psbe2con.seq" in Figure 8, and nucleotides 57 to 2564 of the 
sequence shown in Figure 12 (preferably comprising an in-frame ATG start codon, such 
as the sequence of nucleotides 24 to 56 in the same Figure), or functional equivalents of 
the aforesaid sequences. 
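
The Figure 5 coordinates quoted above can be checked against one another with simple arithmetic. The following is an illustrative sketch only: it assumes the quoted positions are 1-based and inclusive, and the helper name `codon_count` is hypothetical and does not appear in this specification. On that assumption, nucleotides 289 to 2790 span the same number of codons as residues 49 to 882, and nucleotides 145 to 288 span the same number of codons as a 48-residue leader.

```python
def codon_count(start: int, end: int) -> int:
    """Number of whole codons in a 1-based, inclusive nucleotide range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}-{end} is not a whole number of codons")
    return length // 3


# Nucleotide coordinates quoted in the text for Figure 5 (Seq ID No. 14).
mature_codons = codon_count(289, 2790)  # region said to encode the mature enzyme
leader_codons = codon_count(145, 288)   # region said to encode the leader sequence

# Residue numbering quoted in the text: leader 1-48, mature enzyme 49-882.
mature_residues = 882 - 49 + 1
leader_residues = 48 - 1 + 1

print(mature_codons, mature_residues)
print(leader_codons, leader_residues)
```

The check confirms only that the quoted ranges are mutually consistent; it says nothing about the sequences themselves.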

The term "functional equivalent" as applied herein to nucleotide sequences is intended to 
encompass those sequences which differ in their nucleotide composition to that shown in 
Figure 5 but which, by virtue of the degeneracy of the genetic code, encode polypeptides 
having identical or substantially identical amino acid sequences. It is intended that the 
term should also apply to sequences which are sufficiently homologous to the sequence of 
the invention that they can hybridise to the complement thereof under stringent 
hybridisation conditions - such equivalents will preferably possess at least 85%, more 
preferably at least 90%, and most preferably at least 95% sequence homology with the
sequence of the invention as exemplified by nucleotides 289 to 2790 of the DNA sequence 
shown in Figure 5. It will be apparent to those skilled in the art that the nucleotide 
sequence of the invention may also find useful application when present as an "antisense" 
sequence. Accordingly, functionally equivalent sequences will also include those
sequences which can hybridise, under stringent hybridisation conditions, to the sequence 
of the invention (rather than the complement thereof). Such "antisense" equivalents will 
preferably possess at least 85%, more preferably at least 90%, and most preferably 95% 
sequence homology with the complement of the sequence of the invention as exemplified 
by nucleotides 289 to 2790 of the DNA sequence shown in Figure 5. Particular functional 
equivalents are shown, for example, in Figures 8 and 10 (if one disregards the various 
frameshift mutations noted therein). 
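
By way of illustration only, the preferred homology levels and the relationship between a sequence and its "antisense" complement can be sketched as follows. The specification does not prescribe how sequence homology is to be calculated, so the sketch assumes two already-aligned sequences of equal length and counts identical positions. The function names and the ten-base sequences are hypothetical and are not taken from any Figure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def homology_tier(pct: float) -> str:
    """Map a percentage onto the preference levels stated in the text."""
    if pct >= 95:
        return "most preferred (at least 95%)"
    if pct >= 90:
        return "more preferred (at least 90%)"
    if pct >= 85:
        return "preferred (at least 85%)"
    return "below the 85% preference"


# Invented toy sequences, for illustration only.
reference = "ATGCATGCAT"
candidate = "ATGCATGCAA"

pct = percent_identity(reference, candidate)
print(pct, homology_tier(pct))
print(reverse_complement("ATGC"))
```

The percentages are stated in the text as preferences; the governing criterion for a functional equivalent remains hybridisation under stringent conditions, which this sketch does not model.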

The invention also provides vectors, particularly expression vectors, comprising the 
nucleotide sequence of the invention. The vector will typically comprise a promoter and 
one or more regulatory signals of the type well known to those skilled in the art. The 
invention also includes provision of cells transformed (which term encompasses 
transduction and transfection) with a vector comprising the nucleotide sequence of the 
invention. 

The invention further provides a class A SBE polypeptide, obtainable from potato plants. 
In particular the invention provides the polypeptide in substantially pure form, especially 
in a form free from other plant-derived (especially potato plant-derived) components, 
which can be readily accomplished by expression of the relevant nucleotide sequence in 
a suitable non-plant host (such as any one of the yeast strains routinely used for expression 
purposes, e.g. Pichia spp. or Saccharomyces spp). Typically the enzyme will substantially 
comprise the sequence of amino acid residues 49 to 882 shown in Figure 5 (disregarding 
the sequence MNKRIDL, which is not part of the enzyme), or a functional equivalent 
thereof. The polypeptide of the invention may be used in a method of modifying starch 
in vitro, comprising treating starch under suitable conditions (e.g. appropriate temperature, 
pH, etc) with an effective amount of the polypeptide according to the invention. 

The term "functional equivalent", as applied herein to amino acid sequences, is intended 
to encompass amino acid sequences substantially similar to that shown in Figure 5, such 
that the polypeptide possesses sufficient activity to complement the branching enzyme 
mutation in E. coli KV 832 cells (described below) and which is active in E. coli in the 
phosphorylation stimulation assay. Typically such functionally equivalent amino acid
sequences will preferably possess at least 85%, more preferably at least 90%, and most
preferably at least 95% sequence identity with the amino acid sequence of the mature
enzyme (i.e. minus leader sequence) shown in Figure 5. Those skilled in the art will
appreciate that conservative substitutions may be made generally throughout the molecule 
without substantially affecting the activity of the enzyme. Moreover, some non- 
conservative substitutions may be tolerated, especially in the less highly conserved regions 
of the molecule. Such substitutions may be made, for example, to modify slightly the 
activity of the enzyme. The polypeptide may, if desired, include a leader sequence, such 
as that exemplified by residues 1 to 48 of the amino acid sequence shown in Figure 5, 
although other leader sequences and signal peptides and the like are known and may be 
included. 

A portion of the nucleotide sequence of the invention has been introduced into a plant and 
found to affect the characteristics of the plant. In particular, introduction of the sequence 
of the invention, operably linked in the antisense orientation to a suitable promoter, was 
found to reduce the amount of branched starch molecules in the plant. Additionally, it has 
recently been demonstrated in other experimental systems that "sense suppression" can 
also occur (i.e. expression of an introduced sequence operably linked in the sense 
orientation can interfere, by some unknown mechanism, with the expression of the native 
gene), as described by Matzke & Matzke (1995 Plant Physiol. 107, 679-685). Any one 
of the methods mentioned by Matzke & Matzke could, in theory, be used to affect the 
expression in a host of a homologous SBE gene. 

It is believed that antisense methods are mainly operable by the production of antisense 
mRNA which hybridises to the sense mRNA, preventing its translation into functional 
polypeptide, possibly by causing the hybrid RNA to be degraded (e.g. Sheehy et al., 1988
PNAS 85, 8805-8809; Van der Krol et al., Mol. Gen. Genet. 220, 204-212). Sense
suppression also requires homology between the introduced sequence and the target gene, 
but the exact mechanism is unclear. It is apparent however that, in relation to both 
antisense and sense suppression, neither a full length nucleotide sequence, nor a "native" 
sequence is essential. Preferably the "effective portion" used in the method will comprise 
at least one third of the full length sequence, but by simple trial and error other fragments
(smaller or larger) may be found which are functional in altering the characteristics of the 
plant. 

Thus, in a further aspect the invention provides a method of altering the characteristics of
a plant, comprising introducing into the plant an effective portion of the sequence of the 
invention operably linked to a suitable promoter active in the plant. Conveniently the 
sequence will be linked in the anti-sense orientation to the promoter. Preferably the plant 
is a potato plant. Conveniently, the characteristic altered relates to the starch content 
and/or starch composition of the plant (i.e. amount and/or type of starch present in the 
plant). Preferably the method of altering the characteristics of the plant will also comprise 
the introduction of one or more further sequences, in addition to an effective portion of 
the sequence of the invention. The introduced sequence of the invention and the one or 
more further sequences (which may be sense or antisense sequences) may be operably 
linked to a single promoter (which would ensure both sequences were transcribed at 
essentially the same time), or may be operably linked to separate promoters (which may 
be necessary for optimal expression). Where separate promoters are employed they may 
be identical to each other or different. Suitable promoters are well known to those skilled 
in the art and include both constitutive and inducible types. Examples include the CaMV 
35S promoter (e.g. single or tandem repeat) and the patatin promoter. Advantageously 
the promoter will be tissue-specific. Desirably the promoter will cause expression of the 
operably linked sequence at substantial levels only in the tissue of the plant where starch 
synthesis and/or starch storage mainly occurs. Thus, for example, where the sequence is 
introduced into a potato plant, the operably linked promoter may be tuber-specific, such 
as the patatin promoter. 

Desirably, for example, the method will also comprise the introduction of an effective 
portion of a sequence encoding a class B SBE, operably linked in the antisense orientation 
to a suitable promoter active in the plant. Desirably the further sequence will comprise 
an effective portion of the sequence encoding the potato class B SBE molecule. 
Conveniently the further sequence will comprise an effective portion of the sequence 
described by Blennow & Johansson (1991 Phytochem. 30, 437-444) or that disclosed in
WO 92/11375. More preferably, the further sequence will comprise at least an effective
portion of the sequence disclosed in International Patent Application No. WO 95/26407. 
Use of antisense sequences against both class A and class B SBE in combination has now 
been found by the present inventors to result in the production of starch having very 
greatly altered properties (see below). Those skilled in the art will appreciate the 
possibility that, if the plant already comprises a sense or antisense sequence which 
efficiently inhibits the class B SBE activity, introduction of a sense or antisense sequence 
to inhibit class A SBE activity (thereby producing a plant with inhibition of both class A 
and class B activity) might alter greatly the properties of the starch in the plant, without 
the need for introduction of one or more further sequences. Thus the sequence of the 
invention is conveniently introduced into plants already having low levels of class A 
and/or class B SBE activity, such that the inhibition resulting from the introduction of the 
sequence of the invention is likely to have a more pronounced effect. 

The sequence of the invention, and the one or more further sequences if desired, can be 
introduced into the plant by any one of a number of well-known techniques (e.g. 
Agrobacterium-mediated transformation, or by "biolistic" methods). The sequences are 
likely to be most effective in inhibiting SBE activity in potato plants, but theoretically 
could be introduced into any plant. Desirable examples include pea, tomato, maize, 
wheat, rice, barley, sweet potato and cassava plants. Preferably the plant will comprise 
a natural gene encoding an SBE molecule which exhibits reasonable homology with the 
introduced nucleic acid sequence of the invention. 

In another aspect, the invention provides a plant cell, or a plant or the progeny thereof, 
which has been altered by the method defined above. The progeny of the altered plant 
may be obtained, for example, by vegetative propagation, or by crossing the altered plant 
and reserving the seed so obtained. The invention also provides parts of the altered plant, 
such as storage organs. Conveniently, for example, the invention provides tubers 
comprising altered starch, said tubers being obtained from an altered plant or the progeny 
thereof. Potato tubers obtained from altered plants (or the progeny thereof) will be 
particularly useful materials in certain industrial applications and for the preparation and/or 
processing of foodstuffs and may be used, for example, to prepare low-fat waffles and 
chips (amylose generally being used as a coating to prevent fat uptake), and to prepare
mashed potato (especially "instant" mashed potato) having particular characteristics.

In particular relation to potato plants, the invention provides a potato plant or part thereof
which, in its wild type, possesses an effective SBE A gene, but which plant has been
altered such that there is no effective expression of an SBE A polypeptide within the cells 
of at least part of the plant. The plant may have been altered by the method defined 
above, or may have been selected by conventional breeding to be deleted for the class A 
SBE gene, presence or absence of which can be readily determined by screening samples 
of the plants with a nucleic acid probe or antibody specific for the potato class A gene or 
gene product respectively. 

The invention also provides starch extracted from a plant altered by the method defined 
above, or the progeny of such a plant, the starch having altered properties compared to 
starch extracted from equivalent, but unaltered, plants. The invention further provides a 
method of making altered starch, comprising altering a plant by the method defined above 
and extracting therefrom starch having altered properties compared to starch extracted 
from equivalent, but unaltered, plants. Use of nucleotide sequences in accordance with 
the invention has allowed the present inventors to produce potato starches having a wide 
variety of novel properties. 

In particular the invention provides the following: a plant (especially a potato plant) altered 
by the method defined above, containing starch which, when extracted from the plant, has 
an elevated endotherm peak temperature as judged by DSC, compared to starch extracted 
from a similar, but unaltered, plant; a plant (especially a potato plant) altered by the 
method defined above, containing starch which, when extracted from the plant, has an 
elevated viscosity onset temperature (conveniently elevated by 10 - 25°C) as judged by 
viscoamylograph compared to starch extracted from a similar, but unaltered, plant; a plant 
(especially a potato plant) altered by the method defined above, containing starch which, 
when extracted from the plant, has a decreased peak viscosity (conveniently decreased by 
240 - 700 SNUs) as judged by viscoamylograph compared to starch extracted from a
similar, but unaltered, plant; a plant (especially a potato plant) altered by the method 
defined above, containing starch which, when extracted from the plant, has an increased
pasting viscosity (conveniently increased by 37 - 260 SNUs) as judged by viscoamylograph
compared to starch extracted from a similar, but unaltered, plant; a plant (especially a 
potato plant) altered by the method defined above, containing starch which, when extracted 
from the plant, has an increased set-back viscosity (conveniently increased by 224 - 313 
SNUs) as judged by viscoamylograph compared to starch extracted from a similar, but 
unaltered, plant; a plant (especially a potato plant) altered by the method defined above, 
containing starch which, when extracted from the plant, has a decreased set-back viscosity 
as judged by viscoamylograph compared to starch extracted from a similar, but unaltered, 
plant; and a plant (especially a potato plant) altered by the method defined above, 
containing starch which, when extracted from the plant, has an elevated amylose content 
as judged by iodometric assay (i.e. by the method of Morrison & Laignelet 1983, cited 
above) compared to starch extracted from a similar, but unaltered, plant. The invention 
also provides for starch obtainable or obtained from such plants as aforesaid. 

In particular the invention provides for starch which, as extracted from a potato plant by 
wet milling at ambient temperature, has one or more of the following properties, as judged 
by viscoamylograph analysis performed according to the conditions defined below: 
viscosity onset temperature in the range 70-95°C (preferably 75-95°C); peak viscosity in 
the range 500 - 12 stirring number units; pasting viscosity in the range 214 - 434 stirring 
number units; set-back viscosity in the range 450 - 618 or 14 - 192 stirring number units; 
or displays no significant increase in viscosity during viscoamylograph. Peak, pasting and 
set-back viscosities are defined below. Viscosity onset temperature is the temperature at 
which there is a sudden, marked increase in viscosity from baseline levels during 
viscoamylograph, and is a term well-known to those skilled in the art.

In other particular embodiments, the invention provides starch which as extracted from a 
potato plant by wet milling at ambient temperature has a peak viscosity in the range 200 - 
500 SNUs and a set-back viscosity in the range 275-618 SNUs as judged by 
viscoamylograph according to the protocol defined below; and starch which as extracted 
from a potato plant by wet milling at ambient temperature has a viscosity which does not 
decrease between the start of the heating phase (step 2) and the start of the final holding 
phase (step 5) and has a set-back viscosity of 303 SNUs or less as judged by
viscoamylograph according to the protocol defined below. 

For the purposes of the present invention, viscoamylograph conditions are understood to 
pertain to analysis of a 10% (w/w) aqueous suspension of starch at atmospheric pressure, 
using a Newport Scientific Rapid Visco Analyser with a heating profile of: holding at 50°C 
for 2 minutes (step 1), heating from 50 to 95°C at a rate of 1.5°C per minute (step 2), 
holding at 95°C for 15 minutes (step 3), cooling from 95 to 50°C at a rate of 1.5°C per 
minute (step 4), and then holding at 50°C for 15 minutes (step 5). Peak viscosity may be 
defined for present purposes as the maximum viscosity attained during the heating phase 
(step 2) or the holding phase (step 3) of the viscoamylograph. Pasting viscosity may be 
defined as the viscosity attained by the starch suspensions at the end of the holding phase 
(step 3) of the viscoamylograph. Set-back viscosity may be defined as the viscosity of the 
starch suspension at the end of step 5 of the viscoamylograph. 
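
By way of illustration only, the heating profile and the three viscosity definitions given above may be sketched in Python as follows. The trace format (pairs of time in minutes and viscosity in SNUs) and the function names are assumptions made for this sketch; they are not taken from the Rapid Visco Analyser software.

```python
# Heating profile from the text: (step, start temp C, end temp C, duration in minutes).
PROFILE = [
    (1, 50.0, 50.0, 2.0),    # hold at 50 C for 2 minutes
    (2, 50.0, 95.0, 30.0),   # heat at 1.5 C/min: (95 - 50) / 1.5 = 30 minutes
    (3, 95.0, 95.0, 15.0),   # hold at 95 C for 15 minutes
    (4, 95.0, 50.0, 30.0),   # cool at 1.5 C/min: 30 minutes
    (5, 50.0, 50.0, 15.0),   # final hold at 50 C for 15 minutes
]


def step_windows(profile=PROFILE):
    """Return {step: (start_min, end_min)} for each step of the profile."""
    windows, t = {}, 0.0
    for step, _start, _end, minutes in profile:
        windows[step] = (t, t + minutes)
        t += minutes
    return windows


def summarise(trace, profile=PROFILE):
    """Summarise a trace given as a list of (time_min, viscosity_snu) pairs.

    Peak: maximum viscosity during step 2 or step 3.
    Pasting: viscosity at the end of step 3.
    Set-back: viscosity at the end of step 5.
    """
    w = step_windows(profile)

    def last_at_or_before(t_end):
        # Viscosity of the latest reading taken at or before t_end.
        return max((p for p in trace if p[0] <= t_end), key=lambda p: p[0])[1]

    peak = max(v for t, v in trace if w[2][0] <= t <= w[3][1])
    return {
        "peak": peak,
        "pasting": last_at_or_before(w[3][1]),
        "set_back": last_at_or_before(w[5][1]),
    }
```

Under these assumptions the complete run lasts 92 minutes, with step 3 ending at 47 minutes and step 5 ending at 92 minutes.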

In yet another aspect the invention provides starch from a potato plant having an apparent 
amylose content (% w/w) of at least 35%, as judged by iodometric assay according to the 
method described by Morrison & Laignelet (1983 J. Cereal Science 1, 9-20). Preferably 
the starch will have an amylose content of at least 40%, more preferably at least 50%, and 
most preferably at least 66%. Starch obtained directly from a potato plant and having 
such properties has not hitherto been produced. Indeed, as a result of the present 
invention, it is now possible to generate in vivo potato starch which has some properties 
analogous to the very high amylose starches (e.g. Hylon 7) obtainable from maize. 

Starches with high (at least 35%) amylose contents find commercial application as, 
amongst other reasons, the amylose component of starch reassociates more strongly and 
rapidly than the amylopectin component during retrogradation processes. This may result, 
for example, in pastes with higher viscosities, gels of greater cohesion, or films of greater 
strength for starches with high (at least 35%) compared with normal (less than 35%) 
amylose contents. Alternatively, starches may be obtained with very high amylose 
contents, such that the granule structure is substantially preserved during heating, resulting 
in starch suspensions which demonstrate substantially no increase in viscosity during 
cooking (i.e. there is no significant viscosity increase during viscoamylograph conditions 
defined above). Such starches typically exhibit a viscosity increase of less than 10% 
(preferably less than 5%) during viscoamylograph under the conditions defined above. 

In commerce, these valuable properties are currently obtained from starches of high 
amylose content derived from maize plants. It would be of commercial value to have an 
alternative source of high amylose starches from potato as other characteristics such as 
granule size, organoleptic properties and textural qualities may distinguish application 
performances of high amylose starches from maize and potato plants. 

Thus high amylose starch obtained by the method of the present invention may find 
application in many different technological fields, which may be broadly categorised into 
two groups: food products and processing; and "Industrial" applications. Under the 
heading of food products, the novel starches of the present invention may find application 
as, for example, films, barriers, coatings or gelling agents. In general, high amylose 
content starches absorb less fat during frying than starches with low amylose content, thus 
the high amylose content starches of the invention may be advantageously used in 
preparing low fat fried products (e.g. potato chips, crisps and the like). The novel 
starches may also be employed with advantage in preparing confectionery and in granular 
and retrograded "resistant" starches. "Resistant" starch is starch which is resistant to 
digestion by α-amylase. As such, resistant starch is not digested by α-amylases present 
in the human small intestine, but passes into the colon where it exhibits properties similar 
to soluble and insoluble dietary fibre. Resistant starch is thus of great benefit in foodstuffs 
due to its low calorific value and its high dietary fibre content. Resistant starch is formed 
by the retrogradation (akin to recrystallization) of amylose from starch gels. Such 
retrogradation is inhibited by amylopectin. Accordingly, the high amylose starches of the 
present invention are excellent starting materials for the preparation of resistant starch. 
Suitable methods for the preparation of resistant starch are well-known to those skilled in 
the art and include, for example, those described in US 5,051,271 and US 5,281,276. 
Conveniently the resistant starches provided by the present invention comprise at least 5% 
total dietary fibre, as judged by the method of Prosky et al., (1985 J. Assoc. Off. Anal. 
Chem. 65, 677), mentioned in US 5,281,276. 




Under the heading of "Industrial" applications, the novel starches of the invention may be 
advantageously employed, for example, in corrugating adhesives, in biodegradable 
products such as loose fill packaging and foamed shapes, and in the production of glass 
fibers and textiles. 

Those skilled in the art will appreciate that the novel starches of the invention may, if 
desired, be subjected in vitro to conventional enzymatic, physical and/or chemical 
modification, such as cross-linking, introduction of hydrophobic groups (e.g. octenyl 
succinic acid, dodecyl succinic acid), or derivatization (e.g. by means of esterification or 
etherification). 

In yet another aspect the invention provides high (35% or more) amylose starches which 
generate paste viscosities greater than those obtained from high amylose starches from 
maize plants after processing at temperatures below 100°C. This provides the advantage 
of more economical starch gelatinisation and pasting treatments through the use of lower 
processing temperatures than are currently required for high amylose starches from maize 
plants. 

The invention will now be further described by way of illustrative example and with 
reference to the drawings, of which: 

Figure 1 shows a typical viscoamylograph for a 10% w/w suspension of potato starch; 

Figure 2 shows viscoamylographs for 10% suspensions of starch from various maize 
varieties; 

Figure 3 is a schematic representation of the cloning strategy used by the present 
inventors; 

Figure 4a shows the amino acid alignment of the C-terminal portion of starch branching 
enzyme isoforms from various sources: amino acid residues matching the consensus 
sequence are shaded; 

Figure 4b shows the alignment of DNA sequences of various starch branching enzyme 
isoforms which encode a conserved amino acid sequence; 

Figure 5 shows the DNA sequence (Seq ID No. 14) and predicted amino acid sequence 
(Seq ID No. 15) of a full length potato class A SBE cDNA clone obtained by PCR; 

Figure 6 shows a comparison of the most highly conserved part of the amino acid 
sequences of potato class A (uppermost sequence) and class B (lowermost sequence) SBE 
molecules; 

Figure 7 shows a comparison of the amino acid sequence of the full length potato class A 
(uppermost sequence) and pea (lowermost sequence) class A SBE molecules; 

Figure 8 shows a DNA alignment of various full length potato class A SBE clones 
obtained by the inventors; 

Figure 9 shows the DNA sequence of a potato class A SBE clone determined by direct 
sequencing of PCR products, together with the predicted amino acid sequence; 

Figure 10 is a multiple DNA alignment of various full length potato SBE A clones 
obtained by the inventors; 

Figure 11 is a schematic illustration of the plasmid pSJ64; 

Figure 12 shows the DNA sequence and predicted amino acid sequence of the full length 
potato class A SBE clone as present in the plasmid pSJ90; and 

Figure 13 shows viscoamylographs for 10% w/w suspensions of starch from various 
transgenic potato plants made by the relevant method aspect of the invention. 

Examples 

Example 1 

Cloning of Potato class A SBE 

The strategy for cloning the second form of starch branching enzyme from potato is shown 
in Figure 3. The small arrowheads represent primers used by the inventors in PCR and 
RACE protocols. The approximate size of the fragments isolated is indicated by the 
numerals on the right of the Figure. By way of explanation, a comparison of the amino 
acid sequences of several cloned plant starch branching enzymes (SBE) from maize (class 
A), pea (class A), maize (class B), rice (class B) and potato (class B), as well as human 
glycogen branching enzyme, allowed the inventors to identify a region in the 
carboxy-terminal one third of the protein which is almost completely conserved 
(GYLNFMGNEFGHPEWIDFPR) (Figure 4a). A multiple alignment of the DNA 
sequences (human, pea class A, potato class B, maize class B, maize class A and rice class 
B, respectively) corresponding to this region is shown in Figure 4b and was used to design 
an oligo which would potentially hybridize to all known plant starch branching enzymes: 
AATTT(C/T)ATGGGIAA(C/T)GA(A/G)TT(C/T)GG (Seq ID No. 20). 
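
By way of illustration only, the degeneracy of this oligo can be checked with the short Python sketch below. The parsing convention (alternatives written as "(X/Y)", with "I" for inosine counted as a single non-degenerate position) and the function names are assumptions of the sketch. Read in triplets, the oligo appears to correspond to the residues NFMGNEFG of the conserved region, and the four two-fold ambiguous positions give the 16 fold degeneracy mentioned for the POTSBE primer.

```python
import re
from itertools import product

OLIGO = "AATTT(C/T)ATGGGIAA(C/T)GA(A/G)TT(C/T)GG"  # Seq ID No. 20 as printed


def parse(oligo):
    """Split the oligo into positions; each position is a tuple of allowed bases."""
    tokens = re.findall(r"\(([ACGT](?:/[ACGT])+)\)|([ACGTI])", oligo)
    return [tuple(alt.split("/")) if alt else (single,) for alt, single in tokens]


def degeneracy(oligo):
    """Number of distinct sequences represented by the degenerate oligo."""
    n = 1
    for choices in parse(oligo):
        n *= len(choices)
    return n


def expand(oligo):
    """List every individual sequence represented by the degenerate oligo."""
    return ["".join(combo) for combo in product(*parse(oligo))]


if __name__ == "__main__":
    print(len(parse(OLIGO)), "positions;", degeneracy(OLIGO), "fold degenerate")
```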

Library PCR 

The initial isolation of a partial potato class A SBE cDNA clone was from an amplified 
potato tuber cDNA library in the λZap vector (Stratagene). One half µL of a potato 
cDNA library (titre 2.3 x 10^9 pfu/mL) was used as template in a 50 µL reaction containing 
100 pmol of a 16 fold degenerate POTSBE primer and 25 pmol of a T7 primer (present 
in the λZap vector 3' to the cDNA sequences - see Figure 3), 100 µM dNTPs, 2.5 U Taq 
polymerase and the buffer supplied with the Taq polymerase (Stratagene). All components 
except the enzyme were added to a 0.5 mL microcentrifuge tube, covered with mineral 
oil and incubated at 94°C for 7 minutes and then held at 55°C, while the Taq polymerase 
was added and mixed by pipetting. PCR was then performed by incubating for 1 min at 
94°C, 1 min at 58°C and 3 minutes at 72°C, for 35 cycles. The PCR products were 
extracted with phenol/chloroform, ethanol precipitated and resuspended in TE pH 8.0 
before cloning into the T/A cloning vector pT7BlueR (Invitrogen). 

Several fragments between 600 and 1300 bp were amplified. These were isolated from 
an agarose gel and cloned into the pT7BlueR T/A cloning vector. Restriction mapping of 
24 randomly selected clones showed that they belonged to several different groups (based 
on size and presence/absence of restriction sites). Initially four clones were chosen for 
sequencing. Of these four, two were found to correspond to the known potato class B 
SBE sequence, however the other two, although homologous, differed significantly and 
were more similar to the pea class A SBE sequence, suggesting that they belonged to the 
class A family of branching enzymes (Burton et al. , 1995 The Plant Journal, cited above). 
The latter two clones (~800 bp) were sequenced fully. They both contained at the 5' end 
the sequence corresponding to the degenerate oligonucleotide used in the PCR and had a 
predicted open reading frame of 192 amino acids. The deduced amino acid sequence was 
highly homologous to that of the pea class A SBE. 

The ~800 bp PCR derived cDNA fragment (corresponding to nucleotides 2281 to 3076 
of the psbe2con.seq sequence shown in Figure 8) was used as a probe to screen the potato 
tuber cDNA library. From one hundred and eighty thousand plaques, seven positives 
were obtained in the primary screen. PCR analysis showed that five of these clones were 
smaller than the original 800 bp cDNA clone, so these were not analysed further. The 
two other clones (designated 3.2.1 and 3.1.1) were approximately 1200 and 1500 bp in 
length respectively. These were sequenced from their 5' ends and the combined consensus 
sequence aligned with the sequence from the PCR generated clones. The cDNA clone 
3.2.1 was excised from the phage vector and plasmid DNA was prepared and the insert 
fully sequenced. Several attempts to obtain longer clones from the library were 
unsuccessful, therefore clones containing the 5' end of the full length gene were obtained 
using RACE (rapid amplification of cDNA ends). 

Rapid Amplification of cDNA ends (RACE) and PCR conditions 
RACE was performed essentially according to Frohman (1992 Amplifications 11-15). 
Two µg of total RNA from mature potato tubers was heated to 65°C for 5 min and quick 
cooled on ice. The RNA was then reverse transcribed in a 20 µL reaction for 1 hour at 
37°C using BRL's M-MLV reverse transcriptase and buffer with 1 mM DTT, 1 mM 
dNTPs, 1 U/µL RNAsin (Promega) and 500 pmol random hexamers (Pharmacia) as 
primer. Excess primers were removed on a Centricon 100 column and cDNA was 
recovered and precipitated with isopropanol. cDNA was A-tailed in a volume of 20 µL 
using 10 units terminal transferase (BRL), 200 µM dATP for 10 min at 37°C, followed 
by 5 min at 65°C. The reaction was then diluted to 0.5 ml with TE pH 8 and stored at 
4°C as the cDNA pool. cDNA clones were isolated by PCR amplification using the 
primers RoRidT17, Ro and POTSBE24. The PCR was performed in 50 µL using a hot start 
technique: 10 µL of the cDNA pool was heated to 94°C in water for 5 min with 25 pmol 
POTSBE24, 25 pmol Ro and 2.5 pmol of RoRidT17 and cooled to 75°C. Five µL of 10 
x PCR buffer (Stratagene), 200 µM dNTPs and 1.25 units of Taq polymerase were added, 
the mixture heated at 45°C for 2 min and 72°C for 40 min followed by 35 cycles of 94°C 
for 45 sec, 50°C for 25 sec, 72°C for 1.5 min and a final incubation at 72°C for 10 min. 
PCR products were separated by electrophoresis on 1% low melting agarose gels and the 
smear covering the range 600-800 bp fragments was excised and used in a second PCR 
amplification with 25 pmol of Ri and POTSBE25 primers in a 50 µL reaction (28 cycles 
of 94°C for 1 min, 50°C 1 min, 72°C 2 min). Products were purified by chloroform 
extraction and cloned into pT7 Blue. PCR was used to screen the colonies and the longest 
clones were sequenced. 

The first round of RACE only extended the length of the SBE sequence approximately 100 
bases, therefore a new A-tailed cDNA library was constructed using the class A SBE 
specific oligo POTSBE24 (10 pmol) in an attempt to recover longer RACE products. The 
first and second round PCR reactions were performed using new class A SBE primers 
(POTSBE 28 and 29 respectively) derived from the new sequence data. Conditions were 
as before except that the elongation step in the first PCR was for 3 min and the second 
PCR consisted of 28 cycles at 94 °C for 45 seconds, 55 °C for 25 sec and 72 °C for 1 
min 45 sec. 

Clones ranging in size from 400 bp to 1.4 kb were isolated and sequenced. The combined 
sequence of the longest RACE products and cDNA clones predicted a full length gene of 
about 3150 nucleotides, excluding the poly(A) tail (psbe2con.seq in Fig. 8). 

As the sequence of the 5' half of the gene was compiled from the sequence of several 
RACE products generated using Taq polymerase, it was possible that the compiled 
sequence did not represent that of a single mRNA species and/or had nucleotide sequence 
changes. The 5' 1600 bases of the gene were therefore re-isolated by PCR using Ultma, 
a thermostable DNA polymerase which, because it possesses a 3'-5' exonuclease activity, 
has a lower error rate compared to Taq polymerase. Several PCR products were cloned 
and restriction mapped and found to differ in the number of Hind III, Ssp I, and EcoR I 
sites. These differences do not represent PCR artefacts as they were observed in clones 
obtained from independent PCR reactions (data not shown) and indicate that there are 
several forms of the class A SBE gene transcribed in potato tubers. 

In order to ensure that the sequence of the full length cDNA clone was derived from a 
single mRNA species it was therefore necessary to PCR the entire gene in one piece. 
cDNA was prepared according to the RACE protocol except that the adaptor oligo 
RoRidT17 (5 pmol) was used as a primer and after synthesis the reaction was diluted to 200 
µL with TE pH 8 and stored at 4°C. Two µL of the cDNA was used in a PCR reaction 
of 50 µL using 25 pmol of class A SBE specific primers PBER1 and PBERT (see below), 
and thirty cycles of 94°C for 1 min, 60°C for 1 min and 72°C for 3 min. If Taq 
polymerase was used the PCR products were cloned into pT7Blue whereas if Ultma 
polymerase was used the PCR products were purified by chloroform extraction, ethanol 
precipitation and kinased in a volume of 20 µL (and then cloned into pBSSKIIP which 
had been cut with EcoRV and dephosphorylated). At least four classes of cDNA were 
isolated, which again differed in the presence or absence of Hind III, Ssp I and EcoR I 
sites. Three of these clones were sequenced fully, however one clone could not be 
isolated in sufficient quantity to sequence. 

The sequence of one of the clones (number 19) is shown in Figure 5. The first methionine 
(initiation) codon starts a short open reading frame (ORF) of 7 amino acids which is out 
of frame with the next predicted ORF of 882 amino acids which has a molecular mass 
(Mr) of approximately 100 Kd. Nucleotides 6-2996 correspond to SBE sequence - the rest 
of the sequence shown is vector derived. Figure 6 shows a comparison of the most highly 
conserved part of the amino acid sequence of potato class A SBE (residues 180-871, top 
row) and potato class B SBE (bottom row, residues 98-792); the middle row indicates the 
degree of similarity, identical residues being denoted by the common letter, conservative 
changes by two dots and neutral changes by a single dot. Dashes indicate gaps introduced 
to optimise the alignment. The class A SBE protein has 44% identity over the entire 
length with potato class B SBE, and 56% identity therewith in the central conserved 
domain (Figure 6), as judged by the "Megalign" program (DNASTAR). However, Figure 
7 shows a comparison between potato class A SBE (top row, residues 1-873) and pea class 
A SBE (bottom row, residues 1-861), from which it can be observed that the cloned potato 
gene is more homologous to the class A pea enzyme, where the identity is 70 % over 
nearly the entire length, and this increases to 83 % over the central conserved region 
(starting at IPPP at position ~170). It is clear from this analysis that this cloned potato 
SBE gene belongs to the class A family of SBE genes. 
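
By way of illustration only, the molecular mass of approximately 100 Kd quoted above for the predicted ORF of 882 amino acids can be cross-checked with the rough Python estimate below. The average residue mass of 110 Da is a common rule-of-thumb assumption and is not taken from the present specification; a precise value would require summing the masses of the actual residues.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not from the specification


def approx_mass_kda(n_residues, avg_residue_mass=AVERAGE_RESIDUE_MASS_DA):
    """Rough protein mass in kilodaltons from the number of residues."""
    return n_residues * avg_residue_mass / 1000.0


def orf_length_nt(n_residues, include_stop=True):
    """Length in nucleotides of an ORF encoding n_residues amino acids."""
    return 3 * (n_residues + (1 if include_stop else 0))


if __name__ == "__main__":
    print(f"882 residues: about {approx_mass_kda(882):.0f} kDa, "
          f"ORF of {orf_length_nt(882)} nt including the stop codon")
```

On this estimate the 882 residue ORF corresponds to about 97 kDa, in keeping with the approximate figure given in the text.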

An E. coli culture, containing the plasmid pSJ78 (which directs the expression of a full 
length potato SBE Class A gene), has been deposited (on 3rd January 1996) under the 
terms of the Budapest Treaty at The National Collections of Industrial and Marine Bacteria 
Limited (23 St Machar Drive, Aberdeen, AB2 1RY, United Kingdom), under accession 
number NCIMB 40781. Plasmid pSJ78 is equivalent to clone 19 described above. It 
represents a full length SBE A cDNA blunt-end ligated into the vector pBSSKIIP. 

Polymorphism of class A SBE genes 

Sequence analysis of the other two full length class A SBE genes showed that they contain 
frameshift mutations and are therefore unable to encode full length proteins and indeed 
they were unable to complement the branching enzyme deficiency in the KV832 mutant 
(described below). An alignment of the full length DNA sequences is shown in Figure 
8: "10con.seq" (Seq ID No. 12), "19con.seq" (Seq ID No. 14) and "11con.seq" (Seq ID 
No. 13) represent the sequence of full length clones 10, 19 and 11 obtained by PCR using 
the PBER1 and PBERT primers (see below), whilst "psbe2con.seq" (Seq ID No. 18) 
represents the consensus sequence of the RACE clones and cDNA clone 3.2.1. Those 
nucleotides which differ from the overall consensus sequence (not shown) are shaded. 
Dashes indicate gaps introduced to optimise the alignment. Apart from the frameshift 
mutations these clones are highly homologous. It should be noted that the 5' sequence of 
psbe2con is longer because this is the longest RACE product and it also contains several 
changes compared to the other clones. The upstream methionine codon is still present in 
this clone but the upstream ORF is shortened to just 3 amino acids and in addition there 
is a 10 base deletion in the 5' untranslated leader. 

The other significant area of variation is in the carboxy terminal region of the protein 
coding region. Closer examination of this area reveals a GAA trinucleotide repeat 
structure which varies in length between the four clones. These are typical characteristics 
of a microsatellite repeat region. The most divergent clone is #11 which has only one 
GAA triplet whereas clone 19 has eleven perfect repeats and the other two clones have 
five and seven GAA repeats. All of these deletions maintain the ORF but change the 
number of glutamic acid residues at the carboxy terminus of the protein. 
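
By way of illustration only, the length of such a GAA repeat can be measured with the Python sketch below. The function name is an assumption of the sketch, and any sequence passed to it would have to be taken from the clones themselves; none is reproduced here. Since GAA encodes glutamic acid, each in-frame repeat unit contributes one glutamic acid residue.

```python
import re


def longest_gaa_run(seq):
    """Return the number of GAA units in the longest uninterrupted (GAA)n run."""
    runs = re.findall(r"(?:GAA)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


if __name__ == "__main__":
    # Hypothetical test sequence: eleven perfect repeats flanked by arbitrary bases.
    example = "CCT" + "GAA" * 11 + "TGA"
    print(longest_gaa_run(example))
```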

Most of the other differences between the clones are single base changes. It is quite 
possible that some of these are PCR errors. To address this question direct sequencing 
of PCR fragments amplified from first strand cDNA was performed. Figure 9 shows the 
DNA sequence, and predicted amino acid sequence, obtained by such direct sequencing. 
Certain restriction sites are also marked. Nucleotides which could not be unambiguously 
assigned are indicated using standard IUPAC notation and, where this uncertainty affects 
the predicted amino acid sequence, a question mark is used. Sequence at the extreme 5' 
and 3' ends of the gene could not be determined because of the heterogeneity observed in 
the different cloned genes in these regions (see previous paragraph). However this can 
be taken as direct evidence that these differences are real and are not PCR or cloning 
artefacts. 

There is absolutely no evidence for the frameshift mutations in the PCR derived sequence 
and it would appear that these mutations are an artefact of the cloning process, resulting 
from negative selection pressure in E. coli. This is supported by the fact that it proved 
extremely difficult to clone the full length PCR products intact as many large deletions 
were seen and the full length clones obtained were all cloned in one orientation (away 
from the LacZ promoter), perhaps suggesting that expression of the gene is toxic to the 
cells. Difficulties of this nature may have been responsible, at least in part, for the 
previous failure of other researchers to obtain the present invention. 

A comparison of all the full length sequences is shown in Figure 10. In addition to clones 
10, 11 and 19 are shown the sequences of a Bgl II - Xho I product cloned directly into the 
pQE32 expression vector ("86CON.SEQ", Seq ID No. 16) and the consensus sequence of 
the directly sequenced PCR products ("pcrsbe2con.seq", Seq ID No. 17). Those 
nucleotides which differ from the consensus sequence (not shown) are shaded. Dashes 
indicate gaps introduced to optimise the alignment. There are 11 nucleotide differences 
predicted to be present in the mRNA population, which are indicated by asterisks above 
and below the sequence. The other differences are probably PCR artefacts or possibly 
sequencing errors. 

Complementation of a branching enzyme deficient E. coli mutant 
To determine if the isolated SBE gene encodes an active protein i.e. one that has 
branching enzyme activity, a complementation test was performed in the E. coli strain 
KV832. This strain is unable to make bacterial glycogen as the gene for the glycogen 
branching enzyme has been deleted (Keil et al., 1987 Mol. Gen. Genet. 207, 294-301). 
When wild type cells are grown in the presence of glucose they synthesise glycogen (a 
highly branched glucose polymer) which stains a brown colour with iodine, whereas the 
KV832 cells make only a linear chain glucose polymer which stains blueish green with 
iodine. To determine if the cloned SBE gene could restore the ability of the KV832 cells 
to make a branched polymer, the clone pSJ90 (Seq ID No. 19) was used and constructed 
as below. The construct is a PCR-derived, substantially full length fragment (made using 
primers PBE 2B and PBE 2X, detailed below), which was cut with Bgl II and Xho I and 
cloned into the BamH I / Sal I sites of the His-tag expression vector pQE32 (Qiagen). 
This clone, pSJ86, was sequenced and found to have a frameshift mutation of two bases 
in the 5' half of the gene. This frameshift was removed by digestion with Nsi I and SnaB 
I and replaced with the corresponding fragment from a Taq-generated PCR clone to 
produce the plasmid pSJ90 (sequence shown in Figure 12; the first 10 amino acids are 
derived from the expression vector). The polypeptide encoded by pSJ90 would be 
predicted to correspond to amino acids 46-882 of the full SBE coding sequence. The 
construct pSJ90 was transformed into the branching enzyme deficient KV832 cells and 
transformants were grown on solid PYG medium (0.85% KH2PO4, 1.1% K2HPO4, 0.6% 
yeast extract) containing 1.0% glucose. To test for complementation, a loop of cells was 
scraped off and resuspended in 150µl of water, to which was added 15µl Lugol's solution 
(2g KI and 1g I2 per 300ml water). It was found that the potato SBE 
fragment-transformed KV832 cells now stained a yellow-brown colour with iodine whereas control 
cells containing only the pQE32 vector continued to stain blue-green. 

Expression of potato class A SBE in E. coli 

Single colonies of KV832, containing one of the plasmids pQE32, pAGCR1 or pSJ90, 
were picked into 50ml of 2xYT medium containing carbenicillin, kanamycin and 
streptomycin as appropriate (100, 50 and 25 mg/L, respectively) in a 250ml flask and 
grown for 5 hours, with shaking, at 37°C. IPTG was then added to a final concentration 
of 1mM to induce expression and the flasks were further incubated overnight at 25°C. 
The cells were harvested by centrifugation and resuspended in 50 mM sodium phosphate 
buffer (pH 8.0), containing 300mM NaCl, 1mg/ml lysozyme and 1mM PMSF and left on 
ice for 1 hour. The cell lysates were then sonicated (3 pulses of 10 seconds at 40% power 
using a microprobe) and cleared by centrifugation at 12,000g for 10 minutes at 4°C. 
Cleared lysates were concentrated approximately 10 fold in a Centricon™ 30 filtration 
unit. Duplicate 10µl samples of the resulting extract were assayed for SBE activity by the 
phosphorylation stimulation method, as described in International Patent Application No. 
PCT/GB95/00634. In brief, the standard assay reaction mixture (0.2ml) was 200mM 
2-(N-morpholino) ethanesulphonic acid (MES) buffer pH6.5, containing 100nCi of 14C 
glucose-1-phosphate at 50mM, 0.05 mg rabbit phosphorylase A, and E. coli lysate. The 
reaction mixture was incubated for 60 minutes at 30°C and the reaction terminated and 
glucan polymer precipitated by the addition of 1ml of 75% (v/v) methanol, 1% (w/v) 
potassium hydroxide, and then 0.1ml glycogen (10mg/ml). The results are presented 
below: 



Construct                      SBE Activity (cpm) 

pQE32 (control)                 1,829 
pSJ90 (potato class A SBE)     14,327 
pAGCR1 (pea class A SBE)       29,707 



The potato class A SBE activity is 7-8 fold above background levels. It was concluded 
therefore that the potato class A SBE gene was able to complement the BE mutation in the 
phosphorylation stimulation assay and that the cloned gene does indeed code for a protein 
with branching enzyme activity. 
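
By way of illustration only, the fold increase over background can be recomputed from the counts tabulated above with the Python sketch below. The dictionary layout and function name are assumptions of the sketch; the cpm values are those reported in the results.

```python
# Counts per minute reported for each construct.
CPM = {
    "pQE32 (control)": 1829,
    "pSJ90 (potato class A SBE)": 14327,
    "pAGCR1 (pea class A SBE)": 29707,
}


def fold_over_control(cpm, control="pQE32 (control)"):
    """Express each construct's activity as a multiple of the control value."""
    base = cpm[control]
    return {name: value / base for name, value in cpm.items()}


if __name__ == "__main__":
    for name, fold in fold_over_control(CPM).items():
        print(f"{name}: {fold:.1f} fold")
```

The ratio for pSJ90 works out at about 7.8, in agreement with the 7-8 fold figure stated in the text.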

Oligonucleotides 

The following synthetic oligonucleotides (Seq ID No.s 1-11 respectively) were used: 



RoRidT17    AAGGATCCGTCGACATCGATAATACGACTCACTATAGGGA(T)17 
Ro          AAGGATCCGTCGACATC 
Ri          GACATCGATAATACGAC 
POTSBE24    CATCCAACCACCATCTCGCA 
POTSBE25    TTGAGAGAAGATACCTAAGT 
POTSBE28    ATGTTCAGTCCATCTAAAGT 
POTSBE29    AGAACAACAATTCCTAGCTC 
PBER1       GGGGCCTTGAACTCAGCAAT 
PBERT       CGTCCCAGCATTCGACATAA 
PBE2B       CTTGGATCCTTGAACTCAGCAATTTG 
PBE2X       TAACTCGAGCAACGCGATCACAAGTTCGT 
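
By way of illustration only, the relationship between the adaptor primer and the two shorter 17-mer adaptor-derived primers listed above can be checked with the Python sketch below. The variable and function names are assumptions of the sketch; the sequences are those printed in the list, with the oligo(dT) tail of the adaptor omitted. The first 17-mer matches the 5' end of the adaptor and the second lies 11 bases further in, which is the nested arrangement used for the first and second rounds of RACE PCR.

```python
# Adaptor primer as printed, without its oligo(dT) tail.
ADAPTOR = "AAGGATCCGTCGACATCGATAATACGACTCACTATAGGGA"
OUTER = "AAGGATCCGTCGACATC"   # first 17-mer in the list (used in the first PCR)
INNER = "GACATCGATAATACGAC"   # second 17-mer in the list (used in the second PCR)


def position_in(primer, template):
    """Return the 0-based offset of primer within template, or -1 if absent."""
    return template.find(primer)


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    for name, primer in (("outer", OUTER), ("inner", INNER)):
        print(name, "offset", position_in(primer, ADAPTOR),
              "GC fraction", round(gc_fraction(primer), 2))
```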



Example 2 

Production of Transgenic Plants 

Construction of plant transformation vectors with antisense starch branching enzyme 
genes 

A 1200 bp Sac I - Xho I fragment, encoding approximately the -COOH half of the potato 
class A SBE (isolated from the rescued λZap clone 3.2.1), was cloned into the Sac I - Sal 
I sites of the plant transformation vector pSJ29 to create plasmid pSJ64, which is 
illustrated schematically in Figure 11. In the figure, the black line represents the DNA 
sequence. The broken line represents the bacterial plasmid backbone (containing the 
origin of replication and bacterial selection marker), which is not shown in full. The filled 
triangles on the line denote the T-DNA borders (RB = right border, LB = left border). 
Relevant restriction sites are shown above the black line, with the approximate distances 
(in kilobases) between the sites (marked by an asterisk) given by the numerals below the 
line. The thinnest arrows indicate polyadenylation signals (pAnos = nopaline synthase, 
pAg7 = Agrobacterium gene 7), the arrows intermediate in thickness denote protein 
coding regions (SBE II = potato class A SBE, HYG = hygromycin resistance gene) and 
the thickest arrows represent promoter regions (P-2x35 = double CaMV 35S promoter, 
Pnos = nopaline synthase promoter). Thus pSJ64 contained the class A SBE gene 
fragment in an antisense orientation between the 2X 35S CaMV promoter and the nopaline 
synthase polyadenylation signal. 

For information, pSJ29 is a derivative of the binary vector pGPTV-HYG (Becker et al., 
1992 Plant Molecular Biology 20, 1195-1197) modified as follows: an approximately 750 
bp (Sac I, T4 DNA polymerase blunted - Sal I) fragment of pJIT60 (Guerineau et al., 
1992 Plant Mol. Biol. 18, 815-818) containing the duplicated cauliflower mosaic virus 
(CaMV) 35S promoter (Cabb-JI strain, equivalent to nucleotides 7040 to 7376 duplicated 
upstream of 7040 to 7433, Frank et al., 1980 Cell 21, 285-294) was cloned into the Hind 
III (Klenow polymerase repaired) - Sal I sites of pGPTV-HYG to create pSJ29. 

Plant transformation 

Transformation was conducted on two types of potato plant explants; either wild type 
untransformed minitubers (in order to give single transformants containing the class A 
antisense construct alone) or minitubers from three tissue culture lines (which gave rise 
to plants #12, #15, #17 and #18 indicated in Table 1) which had already been successfully 
transformed with the class B (SBE I) antisense construct containing the tandem 35S 
promoter (so as to obtain double transformant plants, containing antisense sequences for 
both the class A and class B enzymes). 

Details of the method of Agrobacterium transformation, and of the growth of transformed 
plants, are described in International Patent Application No. WO 95/26407, except that 
the medium used contained 3 % sucrose (not 1 %) until the final transfer and that the initial 
incubation with Agrobacterium (strain 3850) was performed in darkness. Transformants 
containing the class A antisense sequence were selected by growth in medium containing 
15mg/L hygromycin (the class A antisense construct comprising the HYG gene, i.e. 
hygromycin phosphotransferase). 

Transformation was confirmed in all cases by production of a DNA fragment from the 
antisense gene after PCR in the presence of appropriate primers and a crude extract of 
genomic DNA from each regenerated shoot. 

Characterisation of starch from potato plants 

Starch was extracted from plants as follows: potato tubers were homogenised in water for 
2 minutes in a Waring blender operating at high speed. The homogenate was washed and 
filtered (initially through 2mm, then through 1mm filters) using about 4 litres of water per 
100 g of tubers (6 extractions). Washed starch granules were finally extracted with 
acetone and air dried. 

Starch extracted from singly transformed potato plants (class A/SBE II antisense, or class 
B/SBE I antisense), or from double transformants (class A/SBE II and class B/SBE I 
antisense), or from untransformed control plants, was partially characterised. The results 
are shown in Table 1. The table shows the amount of SBE activity (units/gram tissue) in 
tubers from each transformed plant. The endotherm peak temperature (°C) of starch 
extracted from several plants was determined by DSC, and the onset temperature (°C) of 
pasting was determined by reference to a viscoamylograph ("RVA"), as described in WO 
95/26407. The viscoamylograph profile was as follows: step 1 - 50°C for 2 minutes; step 
2 - increase in temperature from 50°C to 95°C at a rate of 1.5°C per minute; step 3 - 
holding at 95°C for 15 minutes; step 4 - cooling from 95°C to 50°C at a rate of 1.5°C per 
minute; and finally, step 5 - holding at 50°C for 15 minutes. Table 1 shows the peak, 
pasting and set-back viscosities in stirring number units (SNUs), which is a measure of 
the amount of torque required to stir the suspensions. Peak viscosity may be defined for 
present purposes as the maximum viscosity attained during the heating phase (step 2) or 
the holding phase (step 3) of the viscoamylograph. Pasting viscosity may be defined as 
the viscosity attained by the starch suspensions at the end of the holding phase (step 3) of 
the viscoamylograph. Set-back viscosity may be defined as the viscosity of the starch 
suspension at the end of step 5 of the viscoamylograph. 
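
The temperature programme described above fixes the elapsed time at the end of each step, and the three viscosity definitions can be read off a trace mechanically. The following is a minimal sketch in Python and is not part of the method described; the function names and the example trace are hypothetical and serve only to illustrate the definitions.

```python
# Illustrative sketch only: the step durations follow the profile described
# above; the function names and the toy viscosity trace are hypothetical.

RATE_C_PER_MIN = 1.5  # heating and cooling rate stated in the text

# (step number, start temperature, end temperature, duration in minutes)
STEPS = [
    (1, 50.0, 50.0, 2.0),                             # hold at 50 C
    (2, 50.0, 95.0, (95.0 - 50.0) / RATE_C_PER_MIN),  # heat: 30 min
    (3, 95.0, 95.0, 15.0),                            # hold at 95 C
    (4, 95.0, 50.0, (95.0 - 50.0) / RATE_C_PER_MIN),  # cool: 30 min
    (5, 50.0, 50.0, 15.0),                            # hold at 50 C
]


def step_end_times(steps=STEPS):
    """Return {step number: elapsed minutes at the end of that step}."""
    elapsed, ends = 0.0, {}
    for number, _start, _end, duration in steps:
        elapsed += duration
        ends[number] = elapsed
    return ends


def summarise_trace(trace, ends):
    """Apply the three definitions to a trace of (minute, viscosity) pairs."""
    def viscosity_at(minute):
        # Reading closest in time to the requested minute.
        return min(trace, key=lambda point: abs(point[0] - minute))[1]

    heating_and_hold = [v for t, v in trace if ends[1] <= t <= ends[3]]
    return {
        "peak": max(heating_and_hold),      # maximum during steps 2 and 3
        "pasting": viscosity_at(ends[3]),   # end of step 3
        "set_back": viscosity_at(ends[5]),  # end of step 5
    }


ENDS = step_end_times()
# Hypothetical trace in SNUs, not measured data.
TOY_TRACE = [(2, 10), (20, 400), (32, 350), (47, 300), (77, 420), (92, 450)]
SUMMARY = summarise_trace(TOY_TRACE, ENDS)
print(ENDS)
print(SUMMARY)
```

On this profile the end of step 3 falls at 47 minutes and the end of step 5 at 92 minutes, which are the two time points used in the discussion of Figure 13.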

A determination of apparent amylose content (% w/w) was also performed, using the 
iodometric assay method of Morrison & Laignelet (1983 J. Cereal Sci. 1, 9-20). The 
results (percentage apparent amylose) are shown in Table 1. The untransformed and 
transformed control plants gave rise to starches having apparent amylose contents in the 
range 29(+/-3)%. 

Generally similar values for amylose content were obtained for starch extracted from most 
of the singly transformed plants containing the class A (SBE II) antisense sequence. 
However, some plants (#152, 249) gave rise to starch having an apparent amylose content 
of 37-38%, notably higher than the control value. Starch extracted from these plants had 
markedly elevated pasting onset temperatures, and starch from plant 152 also exhibited an 
elevated endotherm peak temperature (starch from plant 249 was not tested by DSC). 

It should be noted that, even if other single transformants were not to provide starch with 
an altered amylose/amylopectin ratio, the starch from such plants might still have different 
properties relative to starch from conventional plants (e.g. different average molecular 
weight or different amylopectin branching patterns), which might be useful. 

Double transformant plants, containing antisense sequences for both the class A and class 
B enzymes, had greatly reduced SBE activity (units/gm) compared to untransformed plants 
or single anti-sense class A transformants (as shown in Table 1). Moreover, certain of 
the double transformant plants contained starch having very significantly altered 
properties. For example, starch extracted from plants #201, 202, 208, 208a, 236 and 
236a had drastically altered amylose/amylopectin ratios, to the extent that amylose was the 
main constituent of starch from these plants. The pasting onset temperatures of starch 
from these plants were also the most greatly increased (by about 25-30°C). Starch from 
plants such as #150, 161, 212, 220 and 230a represented a range of intermediates, in that 
such starch displayed a more modest rise in both amylose content and pasting onset 
temperature. The results would tend to suggest that there is generally a correlation 
between % amylose content and pasting onset temperature, which is in agreement with the 
known behaviour of starches from other sources, notably maize. 

The marked increase in amylose content obtained by inhibition of class A SBE alone, 
compared to inhibition of class B SBE alone (see PCT/GB95/00634) might suggest that 
it would be advantageous to transform plants first with a construct to suppress class A SBE 
expression (probably, in practice, an antisense construct), select those plants giving rise 
to starch with the most altered properties, and then to re-transform with a construct to 
suppress class B SBE expression (again, in practice, probably an antisense construct), so 
as to maximise the degree of starch modification. 

In addition to pasting onset temperatures, other features of the viscoamylograph profile 
e.g. for starches from plants #149, 150, 152, 161, 201, 236 and 236a showed significant 
differences to starches from control plants, as illustrated in Figure 13. Referring to Figure 
13, a number of viscoamylograph traces are shown. The legend is as follows: shaded box 
- normal potato starch control (29.8% amylose content); shaded circle - starch from plant 
149 (35.6% amylose); shaded triangle, pointing upwards - plant 152 (37.5%); shaded 
triangle, pointing downwards - plant 161 (40.9%); shaded diamond - plant 150 (53.1%); 
unshaded box - plant 236a (56.7%); unshaded circle - plant 236 (60.4%); unshaded 
triangle, pointing upwards - plant 201 (66.4%); unshaded triangle, pointing downwards - 
Hylon V starch, from maize (44.9 % amylose). The thin line denotes the heating profile. 
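
For convenience, the legend of Figure 13 can be tabulated and sorted by amylose content. The sketch below is illustrative only; the variable names are hypothetical, and the amylose values are those given in the legend above.

```python
# Illustrative only: amylose contents (% w/w) are copied from the Figure 13
# legend; the data structure and names are hypothetical.
FIGURE_13_LEGEND = [
    # (symbol, sample, apparent amylose content in % w/w)
    ("shaded box", "normal potato starch control", 29.8),
    ("shaded circle", "plant 149", 35.6),
    ("shaded triangle, pointing upwards", "plant 152", 37.5),
    ("shaded triangle, pointing downwards", "plant 161", 40.9),
    ("shaded diamond", "plant 150", 53.1),
    ("unshaded box", "plant 236a", 56.7),
    ("unshaded circle", "plant 236", 60.4),
    ("unshaded triangle, pointing upwards", "plant 201", 66.4),
    ("unshaded triangle, pointing downwards", "Hylon V maize starch", 44.9),
]

by_amylose = sorted(FIGURE_13_LEGEND, key=lambda row: row[2])

# Samples in which amylose is the main constituent (more than half).
amylose_major = [sample for _sym, sample, pct in by_amylose if pct > 50.0]

for symbol, sample, pct in by_amylose:
    print(f"{pct:5.1f}%  {sample:30s}  {symbol}")
```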

With increasing amylose content, peak viscosities during processing to 95°C decrease, and 
the drop in viscosity from the peak until the end of the holding period at 95°C also 
generally decreases (indeed, for some of the starch samples there is an increase in 
viscosity during this period). Both of these results are indicative of reduced granule 
fragmentation, and hence increased granule stability during pasting. This property has not 
previously been available in potato starch without extensive prior chemical or physical 
modification. For applications where a maximal viscosity after processing to 95°C is 
desirable (i.e. corresponding to the viscosity after 47 minutes in the viscoamylograph test), 
starch from plant #152 would be selected, as starches with both lower (Controls, #149) and 
higher (#161, #150) amylose contents have lower viscosities following this gelatinisation 
and pasting regime (Figure 13 and Table 1). It is believed that the viscosity at this stage 
is determined by a combination of the extent of granule swelling and the resistance of 
swollen granules to mechanical fragmentation. For any desired viscosity behaviour, one 
skilled in the art would select a potato starch from a range containing different amylose 
contents produced according to the invention by performing suitable standard viscosity 
tests. 

Upon cooling pastes from 95°C to 50°C, potato starches from most plants transformed 
in accordance with the invention showed an increase in viscoamylograph viscosity as 
expected for partial reassociation of amylose. Starches from plants #149, 152 and 161 all 
show viscosities at 50°C significantly in excess of those for starches from control plants 
(Figure 13 and Table 1). This contrasts with the effect of elevated amylose contents in 
starches from maize plants (Figure 2) which show very low viscosities throughout the 
viscoamylograph test. Of particular note is the fact that, for similar amylose contents, 
starch from potato plant 150 (53% amylose) shows markedly increased viscosity compared 
with Hylon V starch (44.9% amylose) as illustrated in Figure 13. This demonstrates that 
useful properties which require elevated (35% or greater) amylose levels can be obtained 
by processing starches from potato plants below 100°C, whereas more energy-intensive 
processing is required in order to generate similarly useful properties from high amylose 
starches derived from maize plants. 

Final viscosity in the viscoamylograph test (set-back viscosity after 92 minutes) is greatest 
for starch from plant #161 (40.9% amylose) amongst those tested (Figure 13 and Table 
1). Decreasing final viscosities are obtained for starches from plant #152 (37.5% 
amylose), #149 (35.6% amylose) and #150 (53.1% amylose). Set-back viscosity occurs 
where amylose molecules, exuded from the starch granule during pasting, start to 
re-associate outside the granule and form a viscous gel-like substance. It is believed that the 
set-back viscosity values of starches from transgenic potato plants represent a balance 
between the inherent amylose content of the starches and the ability of the amylose 
fraction to be exuded from the granule during pasting and therefore be available for the 
reassociation process which results in viscosity increase. For starches with low amylose 
content, increasing the amylose content tends to make more amylose available for 
re-association, thus increasing the set-back viscosity. However, above a threshold value, 
increased amylose content is thought to inhibit granule swelling, thus preventing exudation 
of amylose from the starch granule and reducing the amount of amylose available for 
re-association. This is supported by the RVA results obtained for the very high amylose 
content potato starches seen in the viscoamylograph profiles in Figure 13. For any 
desired viscosity behaviour following set-back or retrogradation to any desired temperature 
over any desired timescale, one skilled in the art would select a potato starch from a range 
containing different amylose contents produced according to the invention by performing 
standard viscosity tests. 
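
The ordering of final viscosities reported above is not monotonic in amylose content, which is what the proposed threshold effect would predict. A minimal Python sketch of that check follows; only the rank order and the amylose contents come from the text, no viscosity values are assumed, and the names are hypothetical.

```python
# Illustrative only. The rank order (1 = greatest set-back viscosity) and the
# amylose contents are taken from the text; no viscosity values are assumed.
SET_BACK_RANK = {  # plant -> (amylose % w/w, rank by final viscosity)
    "161": (40.9, 1),
    "152": (37.5, 2),
    "149": (35.6, 3),
    "150": (53.1, 4),
}


def is_monotonic_in_amylose(data):
    """True if final viscosity only rises, or only falls, with amylose."""
    ranks = [rank for _amylose, rank in sorted(data.values())]
    rising = all(a > b for a, b in zip(ranks, ranks[1:]))   # rank number falls
    falling = all(a < b for a, b in zip(ranks, ranks[1:]))  # rank number rises
    return rising or falling


# Plant with the greatest final viscosity among those ranked.
best = min(SET_BACK_RANK, key=lambda plant: SET_BACK_RANK[plant][1])
print(best, SET_BACK_RANK[best][0], is_monotonic_in_amylose(SET_BACK_RANK))
```

Leaving out plant #150 makes the remaining three samples monotonic (final viscosity rising with amylose), so it is the 53.1% sample that breaks the trend.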

Further experiments with starch from plants #201 and 208 showed that this had an 
apparent amylose content of over 62% (see Table 1). Viscoamylograph studies showed 
that starch from these plants had radically altered properties and behaved in a manner 
similar to Hylon V starch from maize plants (Figure 13). Under the conditions employed 
in the viscoamylograph, this starch exhibited extremely limited (nearly undetectable) 
granule swelling. Thus, for example, unlike starch from control plants, starch from plants 
201, 208 and 208a did not display a clearly defined pasting viscosity peak during the 
heating phase. Microscopic analysis confirmed that the starch granule structure underwent 
only minor swelling during the experimental heating process. This property may well be 
particularly useful in certain applications, as will be apparent to those skilled in the art. 

Some re-grown plants have so far been found to increase still further the apparent amylose 
content of starch extracted therefrom. Such increases may be due to:- 

i) Growth and development of the first generation transformed plants may have been 
affected to some degree by the exogenous growth hormones present in the tissue culture 
system, which exogenous hormones were not present during growth of the second 
generation plants; and 

ii) Subsequent generations were grown under field conditions, which may allow for 
attainment of greater maturity than growth under laboratory conditions, it being generally 
held that amylose content of potato starch increases with maturity of the potato tuber. 
Accordingly, it should be possible to obtain potato plants giving rise to tubers with starch 
having an amylose content in excess of the 66% level so far attained, simply by analysing 
a greater number of transformed plants and/or by re-growing transgenic plants through one 
or more generations under field conditions. 

Table 1 shows that another characteristic of starch which is affected by the presence of 
anti-sense sequences to SBE is the phosphorus content. Starch from untransformed control 
plants had a phosphorus content of about 60-70mg/100gram dry weight (as determined 
according to the AOAC Official Methods of Analysis, 15th Edition, Method 948.09 
"Phosphorus in Flour"). Introduction into the plant of an anti-sense SBE B sequence was 
found to cause a modest increase (about two-fold) in phosphorus content, which is in 
agreement with the previous findings reported at scientific meetings. Similarly, anti-sense 
to SBE A alone causes only a small rise in phosphorus content relative to untransformed 
controls. However, use of anti-sense to both SBE A and B in combination results in up 
to a four-fold increase in phosphorus content, which is far greater than any in planta 
phosphorus content previously demonstrated for potato starch. 
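
The fold increases quoted above translate into absolute phosphorus contents as follows. This is an arithmetic sketch only; the function name is hypothetical, and the control range of 60-70mg/100gram is the figure given in the text.

```python
# Arithmetic illustration only; the function name is hypothetical.
CONTROL_MG_PER_100G = (60.0, 70.0)  # untransformed control range from the text


def scaled_range(fold, control=CONTROL_MG_PER_100G):
    """Phosphorus range (mg/100 g dry weight) implied by a fold increase."""
    low, high = control
    return (fold * low, fold * high)


two_fold = scaled_range(2)   # about two-fold: anti-sense SBE B alone
four_fold = scaled_range(4)  # up to four-fold: anti-sense SBE A and B together
print(two_fold, four_fold)
```

Because four-fold is stated as an upper bound, measured values for the double transformants would be expected to lie at or below the upper range.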

This is useful in that, for certain applications, starch must be phosphorylated in vitro by 
chemical modification. The ability to obtain potato starch which, as extracted from the 
plant, already has a high phosphorus content will reduce the amount of in vitro 
phosphorylation required suitably to modify the starch. Thus, in another aspect the 
invention provides potato starch which, as extracted from the plant, has a phosphorus 
content in excess of 200mg/100gram dry weight starch. Typically the starch will have a 
phosphorus content in the range 200 - 240mg/100gram dry weight starch. 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 



(i) APPLICANT: 

(A) NAME: National Starch and Chemical Investment Holding Corporation 

(B) STREET: 501 Silverside Road, Suite 27 

(C) CITY: Wilmington 

(D) STATE: Delaware 

(E) COUNTRY: United States of America 

(F) POSTAL CODE (ZIP): 19809 

(ii) TITLE OF INVENTION: Improvements in or Relating to Plant Starch 
Composition 

(iii) NUMBER OF SEQUENCES: 20 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
AAGGATCCGT CGACATCGAT AATACGACTC ACTATAGGGA TTTTTTTTTT TTTTTTT 57 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

AAGGATCCGT CGACATC 17 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GACATCGATA ATACGAC 
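
The declared lengths of SEQ ID NOS: 1 to 3, and the way the two 17-mers relate to the 57-mer, can be checked with a short script. This is a sketch and not part of the listing; the helper name is hypothetical, and the 3' end of SEQ ID NO: 1 is assumed to be a run of 17 T residues, consistent with its declared length of 57.

```python
# Illustrative check only. The 3' run of 17 T residues in SEQ ID NO: 1 is an
# assumption consistent with the declared length of 57 bases.
SEQ_ID_1 = "AAGGATCCGT CGACATCGAT AATACGACTC ACTATAGGGA TTTTTTTTTT TTTTTTT"
SEQ_ID_2 = "AAGGATCCGT CGACATC"
SEQ_ID_3 = "GACATCGATA ATACGAC"


def bases(listing_line):
    """Strip the spacing used in the listing and return the bare sequence."""
    return "".join(listing_line.split()).upper()


seq1, seq2, seq3 = bases(SEQ_ID_1), bases(SEQ_ID_2), bases(SEQ_ID_3)
lengths = (len(seq1), len(seq2), len(seq3))

# SEQ ID NO: 2 is the 5' end of SEQ ID NO: 1; SEQ ID NO: 3 lies within it.
seq2_is_prefix = seq1.startswith(seq2)
seq3_offset = seq1.find(seq3)  # 0-based offset, -1 if absent
print(lengths, seq2_is_prefix, seq3_offset)
```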



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
CATCCAACCA CCATCTCGCA 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
TTGAGAGAAG ATACCTAAGT 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
ATGTTCAGTC CATCTAAAGT 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
AGAACAACAA TTCCTAGCTC 20 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GGGGCCTTGA ACTCAGCAAT 20 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CGTCCCAGCA TTCGACATAA 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CTTGGATCCT TGAACTCAGC AATTTG 26 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
TAACTCGAGC AACGCGATCA CAAGTTCGT 29 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3003 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GATGGGGCCT TGAACTCAGC AATTTGACAC TCAGTTAGTT ACACTGCCAT CACTTATCAG 60 

ATCTCTATTT TTTCTCTTAA TTCCAACCAA GGAATGAATA AAAAGATAGA TTTGTAAAAA 120 

CCCTAAGGAG AGAAGAAGAA AGATGGTGTA TACACTCTCT GGAGTTCGTT TTCCTACTGT 180 

TCCATCAGTG TACAAATCTA ATGGATTCAG CAGTAATGGT GATCGGAGGA ATGCTAATAT 240 

TTCTGTATTC TTGAAAAAAC ACTCTCTTTC ACGGAAGATC TTGGCTGAAA AGTCTTCTTA 300 

CAATTCCGAA TCCCGACCTT CTACAATTGC AGCATCGGGG AAAGTCCTTG TGCCTGGAAT 360 

CCAGAGTGAT AGCTCCTCAT CCTCAACAGA TCAATTTGAG TTCGCTGAGA CATCTCCAGA 420 

AAATTCCCCA GCATCAACTG ATGTAGATAG TTCAACAATG GAACACGCTA GCCAGATTAA 480 

AACTGAGAAC GATGACGTTG AGCCGTCAAG TGATCTTACA GGAAGTGTTG AAGAGCTGGA 540 

TTTTGCTTCA TCACTACAAC TACAAGAAGG TGGTAAACTG GAGGAGTCTA AAACATTAAA 600 

TACTTCTGAA GAGACAATTA TTGATGAATC TGATAGGATC AGAGAGAGGG GCATCCCTCC 660 

ACCTGGACTT GGTCAGAAGA TTTATGAAAT AGACCCCCTT TTGACAAACT ATCGTCAACA 720 

CCTTGATTAC AGGTATTCAC AGTACAAGAA ACTGAGGGAG GCAATTGACA AGTATGAGGG 780 

TGGTTTGGAA GCTTTTTCTC GTGGTTATGA AAGAATGGGT TTCACTCGTA GTGCTACAGG 840 

TATCACTTAC CGTGAGTGGG CTCCTGGTGC CCAGTCAGCT GCCCTCATTG GGGATTTCAA 900 

CAATTGGGAC GCAAATGCTG ACTTTATGAC TCGGAATGAA TTTGGTGTCT GAGAGATTTT 960 

TCTGCCAAAT AATGTGGATG GTTCTCCTGC AATTCCTCAT GGGTCCAGAG TGAAGATACG 1020 

TATGGACACT CCATCAGGTG TTAAGGATTC CATTCCTGCT TGGATCAACT ACTCTTTACA 1080 

GCTTCCTGAT GAAATTCCAT ATAATGGAAT ATATTATGAT CCACCCGAAG AGGAGAGGTA 1140 

TATCTTCCAA CACCCACGGC CAAAGAAACC AAAGTCGGTG AGAATATATG AATCTCATAT 1200 

TGGAATGAGT AGTCCGGAGC CTAAAATTAA CTCATACGTG AATTTTAGAG ATGAAGTTCT 1260 

TCCTCGCATA AAAAAAGCTT GGGTACAATG CGGTGCAAAT TATGGCTATT CAAGAGCATT 1320 

CTTATTATGC TAGTTTTGGT TATCATGTCA CAAATTTTTT TGCACCAAGC AGCCGTTTTG 1380 

GAACGCCCGA CGACCTTAAG TCTTTGATTG ATAAAGCTCA TGAGCTAGGA ATTGTTGTTC 1440 

TCATGGACAT TGTTCACAGC CATGCATCAA ATAATACTTT AGATGGACTG AACATGTTTG 1500 

ACGGCACAGA TAGTTGTTAC TTTCACTCTG GAGCTCGTGG TTATCATTGG ATGTGGGATT 1560 

TCCGCCTCTT TAACTATGGA AACTGGGAGG TACTTAGGTA TCTTCTCTCA AATGCGAGAT 1620 

GGTGGTTGGA TGAGTTCAAA TTTGATGGAT TTAGATTTGA TGGTGTGACA TCAATGATGT 1680 

GTACTCACCA CGGATTATCG GTGGGATTCA CTGGGAACTA CGAGGAATAC TTTGGACTCG 1740 

CAACTGATGT GGATGCTGTT GTGTATCTGA TGCTGGTCAA CGATCTTATT CATGGGCTTT 1800 

TCCCAGATGC AATTACCATT GGTGAAGATG TTAGCGGAAT GCCGACATTT TGTGTTCCCG 1860 

TTCAAGATGG GGGTGTTGGC TTTGACTATC GGCTGCATAT GGCAATTGCT GATAAATGGA 1920 

TTGAGTTGCT CAAGAAACGG GATGAGGATT GGAGAGTGGG TGATATTGTT CATACACTGA 1980 

CAAATAGAAG ATGGTCGGAA AAGTGTGTTT CATACGCTGA AAGTCATGAT CAAGCTCTAG 2040 

TCGGTGATAA AACTATAGCA TTCTGGCTGA TGGACAAGGA TATGTATGAT TTTATGGCTC 2100 

TGGATAGACC GTCAACATCA TTAATAGATC GTGGGATAGC ATTACACAAG ATGATTAGGC 2160 

TTGTAACTAT GGGATTAGGA GGAGAAGGGT ACCTAAATTT CATGGGAAAT GAATTCGGCC 2220 

ACCCTGAGTG GATTGATTTC CCTAGGGCTG AACAACACCT CTCTGATGGC TCAGTAATTC 2280 

CCAGAAACCA ATTCAGTTAT GATAAATGCA GACGGAGATT TGACCTGGGA GATGCAGAAT 2340 

ATTTAAGATA CCGTGGGTTG CAAGAATTTG ACCGGGCTAT GCAGTATCTT GAAGATAAAT 2400 

ATGAGTTTAT GACTTCAGAA CACCAGTTCA TATCACGAAA GGATGAAGGA GATAGGATGA 2460 

TTGTATTTGA AAAAGGAAAC CTAGTTTTTG TCTTTAATTT TCACTGGACA AAAGGCTATT 2520 

CAGACTATCG CATAGGCTGC CTGAAGCCTG GAAAATACAA GGTTGCCTTG GACTCAGATG 2580 

ATCCACTTTT TGGTGGCTTC GGGAGAATTG ATCATAATGC CGAATATTTC ACCTTTGAAG 2640 

GATGGTATGA TGATCGTCCT CGTTCAATTA TGGTGTATGC ACCTAGTAGA ACAGCAGTGG 2700 

TCTATGCACT AGTAGACAAA GAAGAAGAAG AAGAAGAAGA AGTAGCAGTA GTAGAAGAAG 2760 

TAGTAGTAGA AGAAGAATGA ACGAACTTGT GATCGCGTTG AAAGATTTGA ACGCCACATA 2820 

GAGCTTCTTG ACGTATCTGG CAATATTGCA TTAGTCTTGG CGGAATTTCA TGTGACAACA 2880 

GGTTTGCAAT TCTTTCCACT ATTAGTAGTG CAACGATATA CGCAGAGATG AAGTGCTGAA 2940 

CAAAAACATA TGTAAAATCG ATGAATTTAT GTCGAATGCT GGGACGATCG AATTCCTGCA 3000 

GCC 3003 
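
Several of the 20-mer oligonucleotides listed above can be located on SEQ ID NO: 12 by searching both strands. The sketch below is illustrative only: the function names are hypothetical, and just three rows of the listing (positions 1 to 60 and 1561 to 1680, as numbered in the listing) are used, so SEQ ID NOS: 4, 5 and 8 serve as examples.

```python
# Illustrative only; function names are hypothetical. The fragments are rows
# of SEQ ID NO: 12 as printed, keyed by the position of their first base.
FRAGMENTS = {
    1: "GATGGGGCCT TGAACTCAGC AATTTGACAC TCAGTTAGTT ACACTGCCAT CACTTATCAG",
    1561: ("TCCGCCTCTT TAACTATGGA AACTGGGAGG TACTTAGGTA TCTTCTCTCA AATGCGAGAT"
           "GGTGGTTGGA TGAGTTCAAA TTTGATGGAT TTAGATTTGA TGGTGTGACA TCAATGATGT"),
}
PRIMERS = {
    4: "CATCCAACCACCATCTCGCA",
    5: "TTGAGAGAAGATACCTAAGT",
    8: "GGGGCCTTGAACTCAGCAAT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(primer):
    """Return (1-based start, strand) of the first match, or None."""
    for first_base, text in FRAGMENTS.items():
        fragment = "".join(text.split())
        for strand, query in (("+", primer), ("-", reverse_complement(primer))):
            offset = fragment.find(query)
            if offset >= 0:
                return (first_base + offset, strand)
    return None


hits = {seq_id: locate(primer) for seq_id, primer in PRIMERS.items()}
print(hits)
```

A "-" strand hit means the oligonucleotide is the reverse complement of the listed strand at that position, as would be expected for an antisense primer.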
[Figure: pairwise amino acid sequence alignment; upper sequence numbered from about 180 to 860, lower sequence numbered from about 100 to 790] 

[Figure: pairwise amino acid sequence alignment of two polypeptides beginning MVYTLSGVRF (upper) and MVYTISGIRF (lower); upper sequence numbered from 10 to about 870] 
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AAAAACCTCCTCCACTCAGTCiiFGGCTiCTCTCTCTCTCT 



72 tttctcttaattccaaccaBggBaatgaataaaaggat-a 

73 tttctcttaattccaaccaagg-aatgaataaaaggat-a 
71 tttctcttaattccaaccaagg-aatgaataaaaHgat-a 
165 tttctcttaattccaaccaagg-aatgaatbaaa0ga7fia 

191 tgtacaaatctaatggattcagcagtaatggtgatcggag 
191 tgtacaaatctaatggattcagcagtaatggtgatcggag 

189 TGTACAAATCTAATGGATTCAGCAGTAATGGTGATCGGAG 
274 TGTACAAATCTAATGGATTCAGCAGTAATGGTGATCGGAG 

311 AATTCCGACCTtCTACAGTTGCAGCATCGGGGAAAGTCCT 

311 AATTCCGACCTTCTACAGTTGCAGCATCGGGGAAAGTCCT 

309 AATgCCGACCTTCTAC/flTTGCAGCATCGGGGAAAGTCCT 

394 AATECCGACCTTCTACAGTTGCAGCATCGGGGAAAGTCCT 

431 CAGCATCAACTGATGTAGATAGTTCAACAATGGAACACGC 
431 CAGCATCAACTGATGTAGATAGTTCAACAATGGAACACGC 
429 CAGCATCAACTGATGTAGATAGTTCAACAATGGAACACGC 

514 cagcatcaactgatgtEgatagttcaacaatggaacacgc 

551 catcactacaactacaagaaggtggtaaactggaggagtc 

551 catcactacaactacaagaaggtggtaaactggaggagtc 

549 catcactacaactacaagaaggtggtaaactggaggagtc 

634 catcactacaactacaagaaggtggtaaactggaggagtc 

671 ttggtcagaagatttatgaaatagacccccttttgacaaa 

671 ttggtcagaagatttatgaaatagaccccgtttgacaaa 

669 ttgotcagaagatttatgaaatagacccccttttgacaaa 

754 ttggtcagaagatttatgaaatagacccccttttgacaaa 

791 aagcIttttctcgtggttatgaaaaaatgggtttcactcg 
791 aag ^ ttt t ctcgtggttatgaaaaaatgggtttcactcg 
789 aag c,nt ^ ctcgtggttatgaaabaatgggtttcactcg 

874 AAGCTTTTTCTCGTGGTTATGAAAAAATGGGTTTCACTCG 
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-Hgggccttgaactcagcaatttgacactcagttagttac 

tggggccttgaactcagcaatttgacactcagttagttac 

— -- tggggccttgaactcagcaatttgacactcagttagttac 

iesebehjtggggccttgaactcagcaatttgacagtcagttagttac 

gatttgtaaaaaccctaaggagagaagaagaaagatggtgtataBactctct 
gatttgtaaaaaccctaaggagagaagaagaaagatggtgtatacactctct 
gattt gtaaaaaccct aaggagagaagaagaaagatggtgtatacactctct 
gai i i cfihhh^aggagagaagaagaaagatggtgtatacactctct 

gaatgctaatgtttctgtattcttgaaaaagcactctctttcacggaagatc 
gaatgctaa tgttt ctgtattcttgaaaaagcactctgttcacggaagatc 
gaatgctaatfltttctgtattcttgaaaaaflcactctctttcacggaagatc 
gaatgctaatgtttctgtattcttgaaaaagcactctctttcacggaagatc 

tgtgcctggaaRccagagtgatagctcctcatcctcaacagaccaatttgag 

TGTGCCTGGAAfiCCAGAGTGATAGaCCTCATCCTCAACAGACCAATTTGAG 

tgt^cctggaatccagagtgatagctcctcatcctcaacag/|icaatttgag 
tgtEcctggaatccagagtgatagctcctcatgctcaacagaccaatttgag ► 

tagccagattaaaactgagaacgatgacgttgagccgtcaagtgatcttaca 

TAGCCAGATTAAAACTGAGAACGATGACGTTGAGCCGTCAAGTGATCTTACA 

tagccagattaaaactgagaacgatgacgttgagccgtcaagtgatcttaca 
tagccagattaaaactgagaacgatgacgttgagccgtcaagtgatcttaca 

taaaacattaaatacttctgaagagacaattattgatgaatctgataggatc 
taaaacattaaatacttctgaagagacaattattgatgaatctgataggatc 
taaaacattaaatacttctgaagagacaattattgatgaatctgataggatc 
taaaacattaaatacttctgaagagacaattattgatgaatctgataggatc 

ctatcgtcaacaccttgattacaggtattcacagtacaagaaactgagggag 
ctatcgtcaacaccttgattacaggtattcacagtacaagaaactgagggag 
ctatcgtcaacaccttgattacaggtattcacagtacaagaaactgagggag 
ctatcgtcaacaccttgattacaggtattcacagtacaagaaaBtgagggag 

tagtgctacaggtatcacttaccgtgagtgggctcctggtgcccagtcagct 
tagtgctacaggtatcacttaccgtgagtgggctcBtggtgcccagtcagct 
tagtgctacaggtatcacttaccgtgagtgggctcctggtgcccagtcagct 
tagtgctacaggtatcacttaccgtgagtgggctcctggtgcccagtcagct, 
ACTCCTATCACTTATCAGATCTCTATTT 11con.seq
ACTCCTATCACTTATCAGATCTCTATTT 19con.seq
ACTBcHaTCACTTATCAGATCTCTATTT 10con.seq
ACTCCTATCAOBaTCAGATCTCTATTT psbe2con.seq

GGAGTTCGTTTTCCTACTGTTCCATCAG 11con.seq
GGAGTTCGTTTTCCTACTGTTCCATCAG 19con.seq
GGAGTTCGTTTTCCTACTGTTCCATCAG 10con.seq
GGAGTTCGTTTTCCTACTGTTCCATCAG psbe2con.seq

TTGGCTGAAAAGTCTTCTTACAATTCCG 11con.seq
TTGGCTGAAAAGTCTTCTTACAATTCCG 19con.seq
TTGGCTGAAAAGTCTTCTTACAATTCCG 10con.seq
TTGGCTGAAAAGTCTTCTTAcEaTTCCG psbe2con.seq

TTCACTGAGACATCTCCAGAAAATTCCC 11con.seq
TTCACTGAGACATCTCCAGAAAATTCCC 19con.seq
TTCSCTGAGACATCTCCAGAAAATTCCC 10con.seq
TTCACTGAGAC/BCTCCAGAAAATTCCC psbe2con.seq

GGAAGTGTTGAAGAGCTGGATTTTGCTT 11con.seq
GGAAGTGTTGAAGAGCTGGATTTTGCTT 19con.seq
GGAAGTGTTGAAGAGCTGGATTTTGCTT 10con.seq
GGAAGTGTTGAAGACfjTGGATTTTGCTT psbe2con.seq

AGAGAGAGGGGCATCCCTCCACCTGGAC 11con.seq
AGAGAGAGGGGCATCCCTCCACCTGGAC 19con.seq
AGAGAGAGGGGCATCCCTCCACCTGGAC 10con.seq
AGAGAGAGGGGCATCCCTCCACCTGGAC psbe2con.seq

GCAATTGACAAGTATGAGGGTGGTTTGG 11con.seq
GCAATTGACAAGTATGAGGGTGGTTTGG 19con.seq
GCAATTGACAAGTATGAGGGTGGTTTGG 10con.seq
GCAATTGACAAGTATGAGGGTGGTTTGG psbe2con.seq

GCCCTCATTGGAGATTTCAACAATTGGG 11con.seq
GCCCTCATTGGAGATTTCAACAATTGGG 19con.seq
GCCCTCATTGGEgATTTCAACAATTGGG 10con.seq
GCBCTCATTGGAGATTTCAACAATTGGG psbe2con.seq

Fig. 8
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910 ACGCAAATGCTGACATTATGACTCGGAATGAATTTGGTGTC 

911 ACGCAAATGCTGACATTATGACTCGGAATGAATTTGGTGTC 
909 ACGCAAATGCTGA(flTTATGACTCGGAATGAATTTGGTGTC 
994 ACGCAAATGCTGACATTATGACTCGGAATGAATTTGGTGTC 

1030 CTCCATCAGGTGTTAAGGATTCCATTCCTGCTTGGATCAAC 

1031 CTCCATCAGGTGTTAAGGATTCCATTCCTGCTTGGATCAAC 
1029 nXCATCAGGTGTTAAGGATTCCATTCCTGCTTGGATCAAC 

1114 ctHcatcaggtgttaaggattccattcctgcttggatcaac 

1150 aacacccacggccaaagaaaccaaagtcgctgagaatatat 

1151 aacacccacggccaaagaaaccaaagtcgctgagaatatat 
1149 aacacccacggccaaagaaaccaaagtcgBtgagaatatat 
1234 aacacccacggccaaagaaaccaaagtcgctgagaatatat 

1270 taaaaaa-gcttgggtacaatgcgHtgcSaattatggctat 

1271 taaaaaa- gcttgggtacaatgcghtgcaaattatggctat 
1269 taaaaaaHgcttgggtacaatgcggtgcaaattatggctat 
1354 taaaaa/sbfcttgggtacaatgcggtgcaaattatggctat 

1389 gacgaccttaagtctlbgattgataaagctcatgagaagg 

1390 GACGACCTTAAGTCTTTGATTGATAAAGCTCATGAGCTAGG 
1389 GACGACCTTAAGTCTTTGATTGATAAAGCTCATGAGCTAGG 
1473 GACGACCTTAAGTCTTTGATTGATAAAGCTCATGAGCTAGG 

1509 GATAGTTGTTACTTTCACTCTGGAGCTCGTGGTTATCATTG 

1510 GATAGTTGTTACTTTCACTCTGGAGCTCGTGGTTATCATTG 
1509 GATAGTTGTTACTTTCACTCTGGAGCTCGTGGTTATCATTG 
1593 GATAGTTGTTACTTTCACTCTGGAGCTCGTGGTTATCATTG 



1628 GATGAGTTCAAATTTGATGGA 
1630 GATGBGTTCAAATTTGATGGA 

1629 GATGAGTTCAAATTTGATGGA 
1713 GATGAGtEcAAATTTGEtGGA 



AGATTfiGATGGTGTGAC 
AGATTTGATGGTGTGAC 
AGATTTGATGGTGTGAC 
AGATTTGATGGTGTGAC 



1748 GTGGATGCTGTTGTGTATCTGATGCTGGTCAACGATCTTAT 
1750 GTGGATGCTGTTGTGTATCTGATGCTGGTCAACGATCTTAT 

1749 GJGGATGCTGTTGTGTATCTGATGCTGGTCAACGATCTTAT 
1833 GiSGATGCTGffiGTGTATCTGATGCTGGECAACGATCTTAT J 
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TGGGAGA 
TGGGAGA 

tgSgaga 

TGGGAGA 



CTGCCAAATAATGTGGATGGTTCTCCTGCAATTC ^\ 
CTGCCAAATAATGTGGATGGTTCTCCTGCAATTC 
CTGCCAAATAATGTGGATGGTTCTCCTGCAATTC 
CTGCCAAATAATGTGGATGGTTCTCCTGCAATTC 



TACTCTTTACAGCTTCCTGATGAAATTCCATATAATGGAATATATT 

tactctttacagcttcctgatgaaattcgatataatggaataBatt 
tactctttacagcttcctgatgaaattccatataatggaatatatt 

TACTCTTTACAGCTTCCTGATGAAATTCCATATAATGGAATATATT 
GAATCTCATATTGGAATGAGTAGTCCGGAGGCTAAAATTAACTCAT 

gaatctcatattggaatgagtagtccggagcctaaaattaactcat 
gaatct catattggaatgagtagtccggagcctaaaattaagtcat 
gaatctcatattggaatgagtagtccggagcctaaaattaactcat 

tcaagagcattcttattatgctagttttggttatcatgtcacaaat 
tcaagagcattcttatt^gctagttttggttatcatgtcacaaat 
tcmgagcattc™totgctagttttggttatcatgtcacaaat 
tcaagagcattcttattatgctagttttggttatcatgtcacaaat 

aattgttgttctcatggacalflgttcacagccatgcatcaaataat 
aattgttgttctcatggacattgttcacagccatgcatcaaataat 
aattgttgttctcatggacattgttcacagccatgcatcaaataat 
aattgttgttctcatggacattgttcacagccatgcatcaaataat 

gatgtgggattflccgcctctttaactatggaaactgggaggtactt 
gatgtgggattcccgcctctttaactatggaaactgggaggtactt 
gatgtgggattBccgcctctttaactatggaaactgggaggtactt 
gatgtgggattcccgcctctttaactatggaaactgggaggtactt 

atcaatgatgtatactcaccacggattatcggtgggattcactggg 
atcaatgatgtataBtcaccacggattatcggtgggattcactggg 
atcaatgatgtBtactcaccacggattatcggtgggattcactggg 
atcaatgatgtatactcaccacggattatcggtgggattcactggg 

tcatBggcttttcccagatgcaa™ccattggtgaagatgttagc , 
tcatgggc tttt cccagatgcaattaccattggtgaagatgttagc 
tcatgggc tttt cccagatgcaattaccattggtgaagatgttagc 
tcatgggcttttcccagatgcaattaccattggtgaagatgttagcj 
CTCATGGGTCCAGAGTGAAGATACGTATGGACA 11con.seq
CTCATGGGTCCAGAGTGAAGATACGTATGGACA 19con.seq
CTCATGGGTCCAGAGTGAAGATACGTATGGACA 10con.seq
CTCATGGGTCCAGAGTGAAGATACCfi ATGGACA psbe2con.seq

ATGATCCACCCGAAGAGGAGAGGTATATCTTCC 11con.seq
ATGATCCACCCGAAGAGGAGAGGTATATCTTCC 19con.seq
ATGATCCACCCGAAGAGGAGAGGTATATCTTCC 10con.seq
ATGATCCACCCGAAGAGGAGAGGTATEtCTTCC psbe2con.seq

ACGTGAATTTTAGAGATGAAGTTCTTCCTCGCA 11con.seq
ACGTGAATTTTAGAGATGAAGTTCTTCCTCGCA 19con.seq
ACGTGAATTTTAGAGATGAAGTTCTTCCTCGCA 10con.seq
ACGTGAATTTTAGAGATGAAGTTCTTCCTCGCA psbe2con.seq

TTTTTTGCACCAAGCAGCCGTTTTGGAACGCCC 11con.seq
TTTTTTGCACCAAGCAGCCGTTTTGGAACGCCC 19con.seq
TTTTTTGCACCAAGCAGCCGTTTTGGAACGCCC 10con.seq
TTTTTTGCACCAAGCAGCCGTTTTGGAACGCCC psbe2con.seq

ACTTTAGATGGACTGAACATGTTTGACGGCACC 11con.seq
ACTTTAGATGGACTGAACATGTTTGAcBgCACC 19con.seq
ACTTTAGATGGACTGAACATGTTTGACGGCAC0 10con.seq
ACTTTAGATGGACTGAACATGTTTGACGGCAcH psbe2con.seq

AGGTATCTTCTCTCAAATGCGAGATGGTGGTTG 11con.seq
AGGTATCTTCTCTCAAATGCGAGATGGTGGTTG 19con.seq
AGGTATCTTCTCTCAAATGCGAGATGGTGGTTG 10con.seq
AGGTATCTTCTCTCAAATGCGAGATGGTGGTTG psbe2con.seq

AACTACGAGGAATACTTTGGACTCGCAACTGAT 11con.seq
AACTACGAGGAATACTTTGGACTCGCAACTGAT 19con.seq
AACTACGAGGAATACTTTGGACTCGCAACTGAT 10con.seq
AACTACGAGGAATACTTTGGACTCGCAACTGAT psbe2con.seq

GGAATGCCGACATTTTGTATTCCCGTTCAAGAT 11con.seq
GGAATGCCGACATTTTGTATTCCCGTECAAGAB 19con.seq
GGAATGCCGACATTTTGTBTTCCCGTTCAAGAT 10con.seq
GGAATGCCGACATTTTGTATTCCCGTTCAAGAT psbe2con.seq

Fig. 8 SHEET 6
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1868 GGGGGTGTTGGCTTTGACTATCGGCTGCATATGGCAATTGC^ 
1870 GGGGGTGTTGGCTTTGACTATCGGCTGCATATGGCAATTGC 

1869 GGGGGTGTTGGCTTTGACTATCGGCTGCATATGGCAATTGC 
1953 GGGGGTGTTGGCTTTGACTATCGGCTGCATATGGCAATTGC 

1988 AGATGGTCGGAAAAGTGTGTTTCATAGGCTGAAAGTCATGA 
1990 AGATGGTCGGAAAAGTGTGTTTCATACGCTGAAAGTCATGA 

1989 AGATGGTCGGAAAAGTGTGTTTCATACGCTGAAAGTCATGA 
2073 AGATGGTCGGAAAAGTGTGTTTCATACGCTGAAAGTCATGA 

2108 ccgE CAACATCATTAATAGATCGTGGGATAGCATTGCACAA 
2110 CCGTCAACATCATTAATAGATCGTGGGATAGCATTGCACAA 

2109 CCGTCAACATCATTAATAGATCGTGGGATAGCATT0CACAA 

2193 ccgtcaacatcattaatagatcgtgggatagcattgcacaa 

2228 tggattgatttccctagggctgaecbacacctlitctgatgg 
2230 tggattgatttccctagggctgaacaacacctctctgatgg 

2229 tggattgatttccctagggctgaagaacacctctctgatgg 
2313 tggattgatttccctagggctgaacaacacctctctgatgg 

2348 taccbtgggtle caagaatttg a<figggctatgcagtatct 
2350 taccgtgggttgcaagaAtttgaccggSctatgcagtatct 

2349 taccgtgggttgcaagaatttgaccgggctatgcagtatct 
2433 taccgtgggttgcaagaatttgaccgggctatgcagtatct 

2468 GAAaBaGGAAACCTAGTTTTSgTCTTTAATTTTCACTGGAC 

2470 gaaaaaggaaacctag i i i i i gtctttaattttcactggac 

2469 gaaaaaggaaacctagtttttgtctttaattttcactggac 
2553 gaaaaaggaaacctagtttttgtctttaattttcactggac 

2588 tttggtggcttcgggagaattgatcataatgccgaatattt 
2590 tttggtggcttcgggagaattgatcataatgccgaatattt 

2589 tttggtggcttcgggagaattgatcataatgccgaatattt 
2673 tttggtggcttcgggagaattgatcataatgccgaatBttt 

2708 CTAGTAGACAAyfijAGAACfiBgBBBBSSB B 

2710 CTAGTAGACAAAGAAGAAGAAGAAGAAGAAGAA Gyiftftfiyifc^ 

2709 CTAGTAGACAAAGAAGAAGAAGAAGAAGAAGAAG- 

2793 CTAGTAGACAAAGAAGAAGAAGAAGAAGAA< 
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TGATAAATGGATTGAGTTGCTCAAGAAACGGGATGAGGATTGGAGA^ 
TGATAAAHGGATTGAGTTGCTCAAGAAACGGGATGAGGATTGGAGA 
TGATAAATGGATTGAGTTGCTCAAGAAACGGGATGAGGATTGGAGA 
TGATAAATGGATTGAGTTGCTCAAGAAACGGGATGAGGATTGGAGA 

TCAAGCTCTAGTCGGTGATAAAACTATAGCATTCTGGCTGATGGAC 
TCAAGCTCTAGTCGGTGAtAAAACTATAGCATTCTGGCTGATGGAC 
TCAAGCTCTAGTCGGTGATAAAACTATAGCATTCTGGCTGATGGAC 
TCAAGCTCTAGTCGGTGATAAAACTATAGCATTCTGGCTGATGGAC 

GATGATTAGGCTTGTAACTATGGGATTAGGAGGAGAAGGGTACCTA 
GATGATTAGGCTTGTAACTATGGGATTAGGAGGAGAAGGGTACCTA 
GATGATTAGGCTTGTAACTATGGGATTAGGAGGAGAAGGGTACCTA 
GATGATTAGGCTTGTAACTATGGGATTAGGAGGAGAAGGGTACCTA 

CTCAGTAATTCCCGGAAACCAATTCAGTTATGATAAATGCAGACGG 
CTCAGTAATflCCCGGAAACCAATTCAGTTATGATAAATGCAGACGG 

ctcagtaattcccBgaaaccaattcagttatgataaatgcagacgg 

CTCAGTAATTCCCGGAAACCAATTCAGTTATGATAAATGCAGACGG 



TGAAGATAAATATGAGTTTATGACTTCAGAACACCAGTTCATATCA 

tgaagataaatatgagtttatgacttcagaacaccagttcatatca 
tgaagataaatatgagtttatgacttcagaacaccagttcatatca 
tgaagataaatatgagtttatgacttcagaacaccagttcatatga 

aa^jagctattcagactatcgcataggctgcctgaagcctggaaaa 
aaaaagctattcagactatcgcatagJJctgcctgaagcctggaaaa 
aaaaggctattcagactatcgcataggctgcctgaagcctggaaaa 
aaaaagctattcagactatcgcataggctgEctgaagcctggaaaa 

cacctEtgaaggatSgtatgatgatcgtcctHgttcaattatggtg 
cacctttgaaggatggtatgatgatcgtgctcgttcaattatggtg 
cacctttgaaggatggtatgatgatcgtcctcgttcaattatggtg 
cacctttgaaggatggtatgatgatcgtcctcgttcaattatggtg 
GTGGGTGATATTGTTCATACACTGACAAATAGA 11con.seq
GTGGGTGATATTGTTCATACACTGACAAATAGA 19con.seq
GTGGGTGATATTGTTCATACACTGACAAATAGA 10con.seq
GTGGGTGATATTGTTCATACACTGACAAATAGA psbe2con.seq

AAGGATATGTATGATTTTATGGCTCTGGATAGA 11con.seq
AAGGATATGTATGATTTTATGGCTCTGGATAGA 19con.seq
AAGGATATGTATGATTTTATGGCTCTGGATAGA 10con.seq
AAGGATATGTATGATTTTATGGCllTGGATAGA psbe2con.seq

AATTTCATGGGAAATGAATTCGGCCACCCTGAG 11con.seq
AATTTCATGGGAAATGAATTCGGCCACCCTGAG 19con.seq
AATTTCATGGGAAATGAATTCGGCCACCCTGAG 10con.seq
AATTTCATGGGAAATGAATTCGGCCACCCTGAG psbe2con.seq

AGATTTGACCTGGGAGATGCAGAATATTTAAGA 11con.seq
AGATTTGACCTGGGAGATGCAGAATATTTAAGA 19con.seq
AGATTTGACCTGGGAGATGCAGAATATTTAAGA 10con.seq
AGATTTGACCTGGGAGATGCAGAATATTTAAGA psbe2con.seq

CGAAAGGATGAAGGAGATAGGATGATTGTATTT 11con.seq
CGAAAGGATGAAGGAGATAGGATGATTGTATTT 19con.seq
CGAAAGGATGAAGGAGATAGGATGATTGTATTT 10con.seq
CGAAAGGATGAAGGAGATAGGATGATTGTATTT psbe2con.seq

TACAAGGTTGlJCTTGGACTCAGATGATCCACTT 11con.seq
TACAAGGTTGCCTTGGACTCAGATGATCCACTT 19con.seq
TACAAGGTTGCCTTGGACf CAGATGATCCACTT 10con.seq
TACAAGGTTGCCTTGGACTCAGATGATCCACTT psbe2con.seq

TATGCACCTAGTAGAACAGCAGTGGTCTATGCA 11con.seq
TATGCACCTiiGTABAACAGCAGTGGTCTATGCA 19con.seq
TATGCACCTAGTAGAACAGCAGTGGTCTATGCA 10con.seq
TATGCACCTAGTAGAACAGCAGTGGTCTATGCA psbe2con.seq

AACTTGTGATCGCGTTGAAAGATTTGAACG333 llcon . seq 
AACTTGTGATCGCGTTGAAAGATTTGAACG — 19con . seq 
AACTTGTGATCGCGTTGAAAGATTTGAACG — 10con . seq 
AACTTGTGATCGCGTTGAAAGATTTGAACG— psbe2con . seq 
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2795 lilittAfflcflAaTAGAGC 



TTCTTGAt 



" CTACATAGAGCTTCTTGACGTATCTGGCAATAT 
jilt - WCATAGAGCTTCTTGACGTATCTGGCAATAT 
^ 895 — -CTACATAGAGCTTCTTGACGTATCTGGCAATAT 

2924 ^r^SS GAACAAA ^ CATATCTMA ^ 
3005 aSa A ™ AACAA ^ 

3005 agagatgaagtgctgaacaaa-catatgtaaaatcgatgaa 
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2975 
3012 
3003 
3123 



GCCC 
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^tcagtcttggcggaattScatgtgaca^HaaggtttgcaBtt " 

TGCAKAGTCTTGGCGGAAmCATGTGACAg-AAGGTTTGCAATT 

tgcatBagtcttggcggaatttcatgtgacaa-Saggtttgcaatt 

TGCATCAGTCTTGGCGGAATTTCATGTGACAA-AAGGTTTGCAATT 

TTTATGTCGAATGCTGGGACGATCGAATTCCTGCAGCC 
TTTATGTCGAATGCTGGGACGATCGAATTCCTGCAG 
TTTATGTCGAATGCTGGGACGAT CGAATT CCTGCAGCC 
TmTGTCGAATGCTGGGACGEc flBgB cffi GiliBEiZSB 
ctttccactattagtagtGcaBcgatatacgc 11con.seq
CTTTCCACTATTAGTAGTGCAACGATATACGC 19con.seq
CTTTCCACTATTAGTAGTGCAACGATATACGC 10con.seq
CTTTCCACTATTAGTAGTGCAACGATATACGC psbe2con.seq

GTTCTGTAAATTGTCATCTCTTTANATGTACA

11con.seq
19con.seq
10con.seq
psbe2con.seq

AAAAAAAAAAAAAACTCGA

11con.seq
19con.seq
10con.seq
psbe2con.seq
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GGATGCTAATGTTTCTGTATTCTTGAAAAAGCACTCTCTTTCACGG 

I 1 ' 1 -H 1 »f*H 1 1 , I , , , , I , 

CCTACGATTACAAAGACATAAGAACTTTTTCGTGAGAGAAAGTGCC 
A N V S V F L K K H S L S R 

TTCTACAGTTGCAGCATCGGGGAAAGTCCTTGTGCCTGGAAYCCAG 
~* 1 1 1 1 I 1 H — 1 1 i i i i | ■ . _ 

AAGATGTCAACGTCGTAGCCCCTTTCAGGAACACGGACCTTRGGTC 

S T V A A S G K V L V P G ? Q 

GACATCTCCAGAAAATTCCCCAGCATCAACTGATGTAGATAGTTCA 

1 1 ~ H 1 | h— 

CTGTAGAGGTCTTTTAAGGGGTCGTAGTTGACTACATCTATCAAGT 
T SP ENS PAST DVDSS 

TGAGCCGTCAAGTGATCTTACAGGAAGTGTTGAAGAGCTGGATTTT 
1 J— ' ' 1 ■ ■ i 1 1 ■ 1 

ACTCGGCAGTT.CACTAGAATGTCCTTCACAACTTCTCGACCTAAAA 
EPSSDLTGS VE ELDF 

TAAAACATTAAATACTTCTGAAGAGACAATTATTGATGAATCTGAT 
1 — — \ 1 1 -h 1 h— H +^ 

ATTTTGTAATTTATGAAGACTTCTCTGTTAATAACTACTTAGACTA 

K T L N T S E E T I I D E S D 

.Hinc II 

GATTTATGAAATAGACCCCCTTTTGACAAACTATCGTCAACACCTT 
^ 1 i " -t— 1 — . — H — : H 1— •- 

CTAAATACTTTATCTGGGGGAAAACTGTTTGATAGGAGTTGTGGAA 
JYEIDP LL TNYRQ HL-/ 
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; Bgi II 

AAGATCTTGGCTGAAAAGTCTTCTTACAATTCCGAATCCCGACC 

1 1 ' ' ' — 111 1 1 1 1 ' i .... i 1 — *+ 90 

TTCTAGAACCGACTTTTCAGAAGAATGTT.AAGGCTTAGGGCTGG 
K I L AEKS. SY NSE SRP 

AGTGATAGCTCCTCATCCTCAACAGACCAATTTGAGTTCACTGA 

1 1 1 I i ' I ' 1 1 I 1 — — - h< H- 180 

TCACTATCGAGGAGTAGGAGTTGTCTGGTTAAACTCAAGTGACT 

SDSSSSSTDQFEFTE 

ACAATGGAACACGCTAGCCAGATTAAAACTGAGAACGATGACGT 

— < — I 1 1 1 1 1 ■ ' ■ 1 — H— — —>-+-> H "H — f- 270 

TGTTACCTTGTGCGATCGGTCTAATTTTGACTCTTGCTACTGCA 
T M E H A S Q I K T E N D D V 

GCTTCATCACTACAACTACAAGAAGGTGGTAAACTGGAGGAGTC 

h 1 ■ | ; . ' . — i ' ' ' | — i , | i , , — i — — + 360 

CGAAGTAGTGATGTTGATGTTCTTCCACCATTTGACCTCCTCAG 
ASSLQLQEGGKLE E S 

AGGATCAGAGAGAGGGGCATCCCTCCACCTGGACTTGGTCAGAA 
H I 1 H »+— — I I i|50 

TCCTAGTCTCTCTCCCCGTAGGGAGGTGGACCTGAACCAGTCTT 
R I RERG I PPP GLGQK 



GATTACAGGTATTCACAGTACAAGAAACTGAGGGAGGCAATTGA 

i 1 I 1 ' I ■ ' 1 ' 1 — 1 — h- 1 — I- 540 

CTAATGTCCATAAGTGTCATGTTCTTTGACTCCCTCCGTTAACT 

DYRYSQYKKLREAID 

Fig. 9 SHEET 2 
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CAAGTATGAGGGTGGTTTGGAAGCTTTTTCTCGTGGTTATGAAAAA^ 
' H — ■ — ■ i . i . . — M H h- h — 11 ' i ■ 

GTTCATACTCCCACCAAACCTTCGAAAAAGAGCACCAATACTTTTT 
K Y E G G L E A F S R G Y E K 

Pvu II 

GGCTCCTGGTGCCCAGTCAGCTGCCCTCATTGGAGATTTCAACAAT 
1 1 : ' I" 1 | ... i h — 1 i ■ 

CCGAGGACCACGGGTCAGTCGACGGGAGTAACCTCTAAAGTTGTTA 

A P G A Q S A A L I G D F N N 



CTGGGAGATTTTTCTGCCAAATAATGTGGATGGTTCTCCTGCAATT 

T 1 • H 1 1 ' I 1 h— 

GACCCTCTAAAAAGACGGTTTATTACACCTACCAAGAGGACGTTAA 
W E I F L P N N V D G S P A I 

TGTTA AGGATTCCATTCCTGCTTGGATCAACTACTCTTTACAGCTT 
1 1 " 1 H 1 1 H~ 

ACAATTCCTAAGGTAAGGACGAACCTAGTTGATGAGAAATGTCGAA 

V K °S I PAW I NY SLQ L 

AGAGGA GAGGTATRTCTTCCAACACCCACGGCCAAAGAAACCAAAG 
1 f~ 1 1 1 1 1 1 — h— — ^ 

TCTCCTCTCCATAYAGAAGGTTGTGGGTGCCGGTTTCTTTGGTTTC 

E E R Y ? F Q H P R P K K P K - 
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ATGGGTTTCA CTCGTAGTGCTACAGGTATCACTTACCGTGAGTG 

' 1 ' 1 ' I 1 i i | . i i i «_f 630 

TACCCAAAGTGAGCAT.CACGATGTCCATAGTGAATGGCACTCAC 



"GFTRSATGIT YR 



E W 



TGGGACGCAAATGCTGACATTATGACTCGGAATGAATTTGGTG T 

ACCCTGCGTTTACGACTGTAATACTGAGCCTTACTTAAACCACA ?2 ° 
W D A N A D I M T R N E F G V 

CCTCATGGGTCCAGAGTGAAGATACGYATGGACACTCCA TCAGG 

GGAGTACCCAGGTCTCACTTCTATGCRTACCTGTGAGGTAGTCC 8 10 
P H G S R V K I R M D T P S G 

CCTGATGAAATTCCATATAATGGAATATATTATGATCCACCC GA 

GGACTACTTTAAGGTATATTACCTTATATAATACTAGGTGGGCT 9 °° 
P D E • I P Y N G I Y Y D P P E 

TCGCTGAGAATATATGAATCTCATATTGGAATGAGTAGTCCG GA 

AGCGACTCTTATATACTTAGAGTATAACCTTACTCATCAGGCCT "° 
SLR I YE SHIGMSSPE 
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pCmnl 

GCCTAAAATTAACTCATACGTGAATTTTAGAGATGAAGTTCTTCCT^ 
1 1 1 ■ i - ' 1 I . I H 1—*— ► 

CGGATTTTAATTGAGTATGCACTTAAAATCTCTACTTCAAGAAGGA 
PKINSYVNFRD E V LP 

TCAAGAGCATTCTTATTATGCTAGTTTTGGTTATCATGTCACAAAT 

H ' 1 ■ I I I I I I | , 1— r- 

AGTTCTCGTAAGAATAATACGATCAAAACCAATAGTACAGTGTTTA 
Q E H S Y Y A S F G Y H V T N 

GTCTTTGATTGATAAAGCTCATGAGCTAGGAATTGTTGTTGTCATG 
1 • * >>• - * H— ' ' 1 1 1 

CAGAAACTAACTATTTCGAGTACTCGATCCTTAACAACAAGAGTAC 

S L I D K A H E L G I V V L M 



GAACAT GTTTGACGGCACAGATAGTTGTTACTTTCACTCTGGAGCT 

' ■ 1 ' ' 1 I i 1 .... | — •— I , — f_ — 4— 

CTTGTACAAACTGCCGTGTCTATCAACAATGAAAGTGAGACCTCGA 
N M F D G T O S C Y F H S G A 

AAACTGGGAGGTACTTAGGTATCTTCTCTCAAATGCGAGATGGTGG 
" "* ' 1 I 1 1 1 1 * 4 1 | , , ! . 

TTTGACCCTCCATGAATCCATAGAAGAGAGTTTACGCTCTACCACC 

N w E V L RYLLSNA R WW 

ATCAATGATGTATACTCACCACGGATTATCGGTGGGATTCACTGGG 
— - H H ' i 1 1 1—> 1 

TAGTTACTACATATGAGTGGTGCCTAATAGCCACCCTAAGTGACCC 

S . M M Y THHGLSVGFT G. 
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CGCATAAAAAASCT TGGGTACAATGCGGTGCAAATTATGGCTAT 

1 1 H 1 1 1 ' I I 1080 

GCGTATTTTTTSGAACCCATGTTACGCCACGTTTAATACCGATA 

R I K?LGYNAV QIM AI 

TTTTTTGCACCAAGCAGCCGTTTTGGAACGCCCGACGACCTTAA 

— — ' — — H 1 1 — H — — - . — (- ] 170 

AAAAAACGTGGTTCGTCGGCAAAACCTTGCGGGCTGCTGGAATT 

F F A P S S R F G T P D-D L K 

GACATTGT TCACAGCCATGCATCAAATAATACTTTAGATGGACT 

' 1 1 11 1 ) I ' • i i ■ ■ — h- 1260 

CTGTAACAAGTGTCGGTACGTAGTTTATTATGAAATCTACCTGA 
D I V H S H A S N N T L D G L 

Sac I 

CGTGGTTATCA TTGGATGTGGGATTCCCGCCTCTTTAACTATGG 

1 ' ' 1 ' ' I ' ' 1 ,1 I " i | | 1 350 

GCACCAATAGTAACCTACACCCTAAGGGCGGAGAAATTGATACC 
RGYH WMWD SR LFNYG 

TTGGATGAGTTCAAA TTTGATGGATTTAGATTTGATGGTGTGAC 

1 i i | , i i | i i , | ] 4no 

AACCTACTCAAGTTTAAACTACCTAAATCTAAACTACCACACTG 
LDEFKFDGFRFDGVT 

AACTACGAG GAATACTTTGGACTCGCAACTGATGTGGATGCTGT 

TTGATGCTCCTTATGAAACCTGAGCGTTGACTACACCTACGACA 
NY EEYFGLATDVDAV 
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Hinc II 

TGTGTATCTGATGCTGGTCAACGATCTTATTCACGGGCTTTTCCCA^ 
i— • — 1 — — * 1 H 1 1 1 1 1— 

ACACATAGACTACGACCAGTTGCTAGAATAAGTGCCCGAAAAGGGT 
VYLMLVNDL I H GLFP 



TTGTATTCCCGTTCAAGATGGGGGTGTTGGCTTTGACTATCGGCTG 
" ~H — i 1 — i 1 1 I — i (-*■ 

AACATAAGGGCAAGTTCTACCCCCACAACCGAAACTGATAGCCGAC 
CIPVQ DGGVGFD YRL 

GGATGAGGATTGGAGAGTGGGTGATATTGTTCATACACTGACAAAT 
1 1 — I I 1 ■ i ■ 



CCTACTCCTAACCTCTCACCCACTATAACAAGTATGTGACTGTTTA 
DEDW R VGD I VHT LTN 



TCAAGCTCTAGTCGGTGATAAAACTATAGCATYCTGGCTGATGGAC 
~* H ' I I — ■ ■ I 

AGTTC.GAGATCAGCCACTATTTTGATATCGTARGACCGACTACCTG 
QALVGDKTIA 7WLMD 



Fig. 9 SHEET 8



ATTAATAGATCGTGGGATAGCATTGCACAAGATGATTAGGCTTGTA 
' 1 1 1 1 H —i 1 

TAATTATCTAGCACCCTATCGTAACGTGTTCTACTAATCCGAACAT 
LID.RGIALHKM.IRLV. 



Fig. 9 SHEET 7

SUBSTITUTE SHEET (RULE 26)

WO 96/34968

PCT/GB96/01075

GATGCAATTACCATTGGTGAAGATGTTAGCGGAATGCCGACATT 

I »— — 1— — i 1 1620 

CTACGTTAATGGTAACCACTTCTACAATCGCCTTACGGCTGTAA 
DAI T. I G E D V S G M P T F 

/side I 

CATATGGCAATTGCTGATAAATGGATTGAGTTGCTCAAGAAACG 

1 1 1 ' 1 I 1 ' ' H — ' hh — — j 1— i 1710 

GTATACCGTTAACGACTATTTACCTAACTCAACGAGTTCTTTGC 

H M A I A D K W I ELLKKR 

AGAAGATGG TCGGAAAAGTGTGTTTCATMCGCTGAAAGTCATGA 

"~ I 1 — ' I ' 1 1 — ' 1 1 I 1 ^ H— — --h h- 1800 

TCTTCTACCAGCCTTTTCACACAAAGTAKGCGACTTTCAGTACT 
R R W S E K C V S ? A E S H D 

JHinc II 

AAGGATATGTAT GATTTTATGGCTCTGGATAGACCGTCAACATC 

'' 1 ' ' ' 1 ' 11 1 1 1 ' I I i i . i i i i , | i sgn 

TTCCTATACATACTAAAATACCGAGACCTATCTGGCAGTTGTAG 
KDMYD FMA LD RP STS 

Asp 718 

I ^pn I 

ACTATGG GATTAGGAGGAGAAGGGTACCTAAATTTCATGGGAAA 

11 ' 1 I | ■ i i ■ i , . , i | | 1980 

TGATACCCTAATCCTCCTCTTCCCATGGATTTAAAGTACCCTTT 
TMG L GGEGYLNFMGN 
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.EcoRI 

TGAATTCGGCCACCCTGAGTGGATTGATTTCCCTAGGGCTGARCA A- 

ACTTAAGCCGGTGGGACTCACCTAACTAAAGGGAT CCCGACTYGTT 
E F G 'H P E W I 0 ' F P R A £ Q 

; Ssp I 

tgataaatgcagacggagatttgacctgggagatgcagaat 'attta 

'ACTATTTACGTGTGCCTCTAAACTGGACCCTCTACGTCTTATAAAT 
0 K c R R R F D L G D A E Y L 

TGAAGATAAATATGAGTTTATGACTTCAGAACACCAGTTCAT ATCA 

ACTTCTATTTATACTCAAATACTGAAGTCTTGTGGTCAAGTATAGT 
E D K Y E F M T S E H Q F I S 

CCTAGTTTTTGTCTTTAATTTTCACTGGACAAATAGCTATTCAGA C 

GGATCAAAAACAGAAATTAAAAGTGACCTGTTTATCGATAAGTCTG 
LVFv l r NFHWTNS YS D 



GGACTCAGATGATCCACTTTTTGGTGGCTTCGGGAGAAT-FGATCAT 

CCTGAGTCTACTAGGTGAAAAACCACCGAAGCGCTCTTAACTAGTA 
D S D D P L F Q G F G R I D H 

YCGYYCAATTATGGTGTATGCACCTAGTAGAACAGCAGTGGTC TAT 

RGCRRGTTAA.TACCACATACGTGGATCATCTTGTCGTCACCAGATA 
R 7 1 M V Y A P S R T A V V Y 



NGAAGAATTTT 



TTTT f 
—+*■ 2531 I 
NCTTCTTAAAA J 
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CACCTCTCTGATG GCTCAGTAATTCCCGGAAACCAATTCAGTTA 

1 ' ' ' I ' 1 1 1 ' i i . . i I omn 

GTGGAGAGACTACCGAGTCATTAAGGGCCTTTGGTTAAGTCAAT 
H L S D G S V I P G N Q F S Y 

.Mcol 

AGATACCATGGGTTGCAAGAATTTGACCGGGCTATGCAGTATCT 

TCTATGGTACCCAACGTTCTTAAACTGGCCCGATACGTCATAGA 2 1 6 ° 
RYHGLQ EFDRAMQ YL 

CGAAAGGATGAAGGAGATAGGATGATTGTATTTGAAARAGGA AA 

GCTTTCCTACTTCCTCTATCCTACTAACATAAACTTTYTCCTTT 225 ° 
RKDEGDRMI VFE7GN 

TATCGCATAGGCTGCCTGAAGCCTGGAAAA7ACAAGGTTGGC TT 

ATAGCGTATCCGACGGAC.TTCGGACCTTTTATGTTCCAACCGAA 23<>0 
Y R I G C L K P G K Y K V G L 

; Sspl 

AATGCCGAATATTTCACCTCTGAAGGATCGTATGATGATCGYC C 

TTACGGCTTATAAAGTGGAGACTTCCTAGCATACTACTAGCRGG 
NAE Y FTSE GS YDDRP 



GCACTAGTAG ACAAANTAGAAGNAGAAGAAGAAGAAGAANCCGN 

— 11 ' 1 I ■ i | i i i -+ peon 

CGTGATCATCTGTTTNATCTTCNTCTTCTTCTTCTTCTTNGGCN 
ALVDK7 E7EEE EE?? 
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69 
70 
71 
7 

1 



278 

280 

280 

57 

50 



10 20 30 

fl-GATGGG@CCTTGAACTCAGCAATTTGACACTCAGT 
TRGATGGG-CCTTGAACTCAGCAATTTGACACTCAGT 
TOGATGGGr" 




80 



90 



100 



tttttctcttaattccaaccaagg-aatgaataaaaB 

tttttctcttaattccaacca@ggBaatgaataaaag 
tttttctcttaattccaaccaagg -aatgaataaaar 

IaaBag 




150 160 170 

138 gaaagatggtgtatacactctctggagttcgttttcc 

52 ^AAAGATGGTGTATAflACTCTCTGGAGTTCGTTTTCC 
™ GAAAGATGGTGTATACACTCTCTGGAGTTCGT TTTCC 




220 230 240 

208 CAGCAGTAATGGTGATCGGAGGAATGCTAAT0TTTCT 

2 0 CAGCAGTAATGGTGATCGGAGGAATGCTAATGTTTCT 

210 CA GCAGTAATGGTGATCGGAGGAATGCTAA TGTTTHT 

48 CAP" ~~ ~ ~ 

1 ^^^^^g^^^^^^G@ATGCTAATGTTTCT 



290 300 310 

ATCTTGGCTGAAAAGTCTTCTTACAATTCCGAATHcC 
ATCTTGGCTGAAAAGTCTTCTTACAATTCCGAATTCC 
ATCTTGGCTGAAAAGtCTTCTTACAATTCCGAATTCC 
ATCTTGGCTGAAAAGTCTTCTTACAATTCCGAATTCC 

atcttggctgaaaagtcttcttacaattccgaatHcc 
40 50 60 70

TAGTTACACT@c£aTCACTTATCAGATCTCTAT 10con.seq
TAGTTACACTCCTATCACTTATCAGATCTCTAT 11con.seq
TAGTTACACTCCTATCACTTATCAGATCTCTAT 19con.seq
86CON.SEQ
pcrsbe2con.seq

110 120 130 140

GATAGATTTGTAAAAACCCTAAGGAGAGAAGAA 10con.seq
GATAGATTTGTAAAAACCCTAAGGAGAGAAGAA 11con.seq
GATAGATTTGTAAAAACCCTAAGGAGAGAAGAA 19con.seq
GAEaHATTB BAACIBtBAGIEGAB H 86CON.SEQ
pcrsbe2con.seq

180 190 200 210

TACTGTTCCATCAGTGTACAAATGTAATGGATT 10con.seq
TACTGTTCCATCAGTGTACAAATCTAATGGATT 11con.seq
TACTGTTCCATCAGTGTACAAATCTAATGGATT 19con.seq
86CON.SEQ
pcrsbe2con.seq

250 260 270 280

GTATTCTTGAAAAA0CACTCTCTTTCACGGAAG 10con.seq
GTATTCTTGAAAAAGCACTCTCTTTCACGGAAG 11con.seq
GTATTCTTGAAAAAGCACTCTCTTTCACGGAAG 19con.seq
B&SBSBS3BXBBBBca0ggEBg 86CON.SEQ
GTATTCTTGAAAAAGCACTCTCTTTCACGGAAG pcrsbe2con.seq

320 330 340 350

GACCTTCTACAQTTGCAGCATCGGGGAAAGTCC 10con.seq
GACCTTCTACAGTTGCAGCATCGGGGAAAGTCC 11con.seq
GACCTTCTACAGTTGCAGCATCGGGGAAAGTCC 19con.seq
GACCTTCTACAGTTGCAGCATCGGGGAAAGTCC 86CON.SEQ
GACCTTCTACAGTTGCAGCATCGGGGAAAGTCC pcrsbe2con.seq

360 370 380

348 ttgtgcctggaaflccagagtgatagctcctcatcctc
350 ttgtgcctggaacccagagtgatagctcctcatcctc
350 ttgtgcctggaacccagagtgatagctcctcatcctc
127 ttgtgcctggaacccagagtgatagctcctcatcctc
120 ttgtgcctggaaDccagagtgatagctcctcatcctc

430 440 450

418 AGAAAATTCCCCAGCATCAACTGATGTAGATAGTTCA
420 AGAAAATTCCCCAGCATCAACTGATGTAGATAGTTCA
420 AGAAAATTCCCCAGCATCAACTGATGTAGATAGTTCA
197 AGAAAATTCCCCAGCATCAACTGATGTAGATAGTTCA
190 AGAAAATTCCCCAGCATCAACTGATGTAGATAGTTCA

500 510 520

488 AACGATGACGTTGAGCCGTCAAGTGATCTTACAGGAA
490 AACGATGACGTTGAGCCGTCAAGTGATCTTACAGGAA
490 AACGATGACGTTGAGCCGTCAAGTGATCTTACAGGAA
267 AACGATGACGTTGAGCCGTCAAGTGATCTTACAGGAA
260 AACGATGACGTTGAGCCGTCAAGTGATCTTACAGGAA

570 580 590

558 AACTACAAGAAGGTGGTAAACTGGAGGAGTCTAAAAC
560 AACTACAAGAAGGTGGTAAACTGGAGGAGTCTAAAAC
560 AACTACAAGAAGGTGGTAAACTGGAGGAGTCTAAAAC
337 AACTACAAGAAGGTGGTAAACTGGAGGAGTCTAAAAC
330 AACTACAAGAAGGTGGTAAACTGGAGGAGTCTAAAAC

640 650 660

628 atctgataggatcagagagaggggcatccctccacct
630 atctgataggatcagagagaggggcatccctccacct
630 atctgataggatcagagagaggggcatccctccacct
407 atctgataggatcagagagaggggcatccctccacct
400 atctgataggatcagagagaggggcatccctccacct

390 400 410 420

AACAGADCAATTTGAGTTCBCTGAGACATCTCC 10con.seq
AACAGACCAATTTGAGTTCACTGAGACATCTCC 11con.seq
AACAGACCAATTTGAGTTCACTGAGACATCTCC 19con.seq
AACA0ACCAATTTGAGTTCACTGAGACATCTCC 86CON.SEQ
AACAGACCAATTTGAGTTCACTGAGACATCTCC pcrsbe2con.seq

460 470 480 490

ACAATGGAACACGCTAGCCAGATTAAAACTGAG 10con.seq
ACAATGGAACACGCTAGCCAGATTAAAACTGAG 11con.seq
ACAATGGAACACGCTAGCCAGATTAAAACTGAG 19con.seq
ACAATGGAACACGCTAGCCAGATTAAAACTGAG 86CON.SEQ
ACAATGGAACACGCTAGCCAGATTAAAACTGAG pcrsbe2con.seq

530 540 550 560

GTGTTGAAGAGCTGGATTTTGCTTCATCACTAC 10con.seq
GTGTTGAAGAGCTGGATTTTGCTTCATCACTAC 11con.seq
GTGTTGAAGAGCTGGATTTTGCTTCATCACTAC 19con.seq
GTGTTGAAGAGCTGGATTTTGCTTCATCACTAC 86CON.SEQ
GTGTTGAAGAGCTGGATTTTGCTTCATCACTAC pcrsbe2con.seq

600 610 620 630

ATTAAATACTTCTGAAGAGACAATTATTGATGA 10con.seq
ATTAAATACTTCTGAAGAGACAATTATTGATGA 11con.seq
ATTAAATACTTCTGAAGAGACAATTATTGATGA 19con.seq
ATTAAATACTTCTGAAGAGACAATTATTGATGA 86CON.SEQ
ATTAAATACTTCTGAAGAGACAATTATTGATGA pcrsbe2con.seq

670 680 690 700

GGACTTGGTCAGAAGATTTATGAAATAGACCCC 10con.seq
GGACTTGGTCAGAAGATTTATGAAATAGACCCC 11con.seq
GGACTTGGTCAGAAGATTTATGAAATAGACCCC 19con.seq
GGACTTGGTCAGAAGATTTATGAAATAGACCCC 86CON.SEQ
GGACTTGGTCAGAAGATTTATGAAATAGACCCC pcrsbe2con.seq

710 720 730

698 CTTTTGACAAACTATCGTCAACACCTTGATTACAGGT
700 CTTTTGACAAACTATCGTCAACACCTTGATTACAGGT
700 CTTTTGACAAACTATCGTCAACACCTTGATTACAGGT
477 CTTTTGACAAACTATCGTCAACACCTTGATTACAGGT
470 CTTTTGACAAACTATCGTCAACACCTTGATTACAGGT

780 790 800

768 ACAAGTATGAGGGTGGTTTGGAAGCTTTTTCTCGTGG
770 ACAAGTATGAGGGTGGTTTGGAAGCHTTTTCTCGTGG
770 ACAAGTATGAGGGTGGTTTGGAAGCHTTTTCTCGTGG
547 ACAAGTATGAGGGTGGTTTGGAAGCTTTTTCTCGTGG
540 ACAAGTATGAGGGTGGTTTGGAAGCTTTf TCTCGTGG

850 860 870

838 aggtatcacttaccgtgagtgggctcctggtgcccag
839 aggtatcacttaccgtgagtgggctcctggtgcccag
840 aggtatcacttaccgtgagtgggctcDtggtgcccag
617 aggtatcacttaccgtgagtgggctcctggtgcccag
610 aggtatcacttaccgtgagtgggctcctggtgcccag

920 930 940

908 GACGCAAATGCTGACflTTATGACTCGGAATGAATTTG
909 GACGCAAATGCTGACATTATGACTCGGAATGAATTTG
910 GACGCAAATGCTGACATTATGACTCGGAATGAATTTG
687 GACGCAAATGCTGACATTATGACTCGGAATGAATTTG
680 GACGCAAATGCTGACATTATGACTCGGAATGAATTTG

990 1000 1010

978 ATGGTTCTCCTGCAATTCCTCATGGGTCCAGAGTGAA
979 ATGGTTCTCCTGCAATTCCTCATGGGTCCAGAGTGAA
980 ATGGTTCTCCTGCAATTCCTCATGGGTCCAGAGTGAA
757 ATGGTTCTCCTGCAATTCCTCATGGGTCCAGAGTGAA
750 ATGGTTCTCCTGCAATTCCTCATGGGTCCAGAGTGAA

740 750 760 770

ATTCACAGTACAAGAAACTGAGGGAGGCAATTG 10con.seq
ATTCACAGTACAAGAAACTGAGGGAGGCAATTG 11con.seq
ATTCACAGTACAAGAAACTGAGGGAGGCAATTG 19con.seq
ATTCAGAGTACAAGAAACTGAGGGAGGCAATTG 86CON.SEQ
ATTCACAGTACAAGAAACTGAGGGAGGCAATTG pcrsbe2con.seq

810 820 830 840

TTATGAAA@AATGGGTTTCACTCGTAGTGCTAC 10con.seq
TTATGAAAAAATGGGTTTCACTCGTAGTGCTAC 11con.seq
TTATGAAAAAATGGGTTTCACTCGTAGTGCTAC 19con.seq
TTATGAAAAAATGGGTTTCACTCGTAGTGCTAC 86CON.SEQ
TTATGAAAAAATGGGTTTCACTCGTAGTGCTAC pcrsbe2con.seq

880 890 900 910

TCAGCTGCCCTCATTGGgGATTTCAACAATTGG 10con.seq
TCAGCTGCCCTCATTGGAGATTTCAACAATTGG 11con.seq
TCAGCTGCCCTCATTGGAGATTTCAACAATTGG 19con.seq
TCAGCTGCCCTCATTGGAGATTTCAACAATTGG 86CON.SEQ
TCAGCTGCCCTCATTGGAGATTTCAACAATTGG pcrsbe2con.seq

950 960 970 980

GTGTCTG0GAGATTTTTCTGCCAAATAATGTGG 10con.seq
GTGTCTGGGAGATTTTTCTGCCAAATAATGTGG 11con.seq
GTGTCTGGGAGATTTTTCTGCCAAATAATGTGG 19con.seq
GTGTCTGGGAGATTTTTCTGCCAAATAATGTGG 86CON.SEQ
GTGTCTGGGAGATTTTTCTGCCAAATAATGTGG pcrsbe2con.seq

1020 1030 1040 1050

GATACGTATGGACACTCCATCAGGTGTTAAGGA 10con.seq
GATACGTATGGACACTCCATCAGGTGTTAAGGA 11con.seq
GATACGTATGGACACTCCATCAGGTGTTAAGGA 19con.seq
GATACGTATGGACAGTCCATCAGGTGTTAAGGA 86CON.SEQ
GATACGSiATGGACACTCCATCAGGTGTTAAGGA pcrsbe2con.seq

1060 1070 1080

1048 TTCCATTCCTGCTTGGATCAACTACTCTTTACAGCTT
1049 TTCCATTCCTGCTTGGATCAACTACTCTTTACAGCTT
1050 TTCCATTCCTGCTTGGATCAACTACTCTTTACAGCTT
827 TTCCATTCCTGCTTGGATCAACTACTCfiflTACAGCTT
820 TTCCATTCCTGCTTGGATCAACTACTCTTTACAGCTT

1130 1140 1150

1118 GATCCACCCGAAGAGGAGAGGTATATCTTCCAACACC
1119 GATCCACCCGAAGAGGAGAGGTATATCTTCCAACACC
1120 GATCCACCCGAAGAGGAGAGGTATATCTTCCAACACC
895 GATCCACCCGAAGAGGAGAGGTATATCTTCCAACACC
890 GATCCACCCGAAGAGGAGAGGTAT0TCTTCCAACACC

1200 1210 1220

1188 ATGAATCTCATATTGGAATGAGTAGTCCGGAGCCTAA
1189 ATGAATCTCATATTGGAATGAGTAGTCCGGAGCCTAA
1190 ATGAATCTCATATTGGAATGAGTAGTCCGGAGCCTAA
965 ATGAATCTCATATTGGAATGAGTAGTCCGGAGCCTAA
960 ATGAATCTCATATTGGAATGAGTAGTCCGGAGCCTAA

1270 1280 1290

1258 TCTTCCTeGCATAAAAAAHGCTTGGGTACAATGCGBT
1259 TCTTCCTCGCATAAAAAA-GCTTGGGTACAATGCGCT
1260 TCTTCCTCGCATAAAAAA-GGTTGGGTACAATGCGCT
1035 TCTTCCTCGCATAAAAAA-GCTTGGGTACAATGCGCT
1030 TCTTCCTCGCATAAAAAA-@CTTGGGTACAATGCG@T

1340 1350 1360

1328 tgctagttttggttatcatgtcacaaatttttttgca
1328 tgctagttttggttatcatgtcacaaatttttttgca
1329 Bgctagttttggttatcatgtcacaaatttttttgca
1104 tgctagttttggttatcatgtcacaaatttttttgca
1099 tgctagttttggttatcatgtcacaaatttttttgca

1090 1100 1110 1120

CCTGATGAAATTCCATATAATGGAATATATTAT 10con.seq
CCTGATGAAATTCCATATAATGGAATATATTAT 11con.seq
CCTGATGAAATTCCATATAATGGAATAHATTAT 19con.seq
CCTGATGAAATTCCATATAATGGAATATATTAT 86CON.SEQ
CCTGATGAAATTCCATATAATGGAATATATTAT pcrsbe2con.seq

1160 1170 1180 1190

CACGGCCAAAGAAACCAAAGTCG@TGAGAATAT 10con.seq
CACGGCCAAAGAAACCAAAGTCGCTGAGAATAT 11con.seq
CACGGCCAAAGAAACCAAAGTCGCTGAGAATAT 19con.seq
CACGGCCAAAGAAACCAAAGTCGCTGAGAATAT 86CON.SEQ
CACGGCCAAAGAAACCAAAGTCGCTGAGAATAT pcrsbe2con.seq

1230 1240 1250 1260

AATTAACTCATACGTGAATTTTAGAGATGAAGT 10con.seq
AATTAACTCATACGTGAATTTTAGAGATGAAGT 11con.seq
AATTAACTCATACGTGAATTTTAGAGATGAAGT 19con.seq
AATTAACTCATACGTGAATTTTAGAGATGAAGT 86CON.SEQ
AATTAACTCATACGTGAATTTTAGAGATGAAGT pcrsbe2con.seq

1300 1310 1320 1330

GCAAATTATGGCTATTCAAGAGCATTCTTATTA 10con.seq
GC@AATTATGGCTATTCAAGAGCATTCTTATTA 11con.seq
GCAAATTATGGCTATTCAAGAGCATTCTTATTA 19con.seq
GCAAATTATGGCTATTCAAGAGCATTCTTATTA 86CON.SEQ
GCAAATTATGGCTATTCAAGAGCATTCTTATTA pcrsbe2con.seq

1370 1380 1390 1400

CCAAGCAGCCGTTTTGGAACGCCCGACGACCTT 10con.seq
CCAAGCAGCCGTTTTGGAACGCCCGACGACCTT 11con.seq
CCAAGCAGCCGTTTTGGAACGCCCGACGACCTT 19con.seq
CCAAGCAGCCGTTTTGGAACGCCCGACGACCTT 86CON.SEQ
CCAAGCAGCCGTTTTGGAACGCCCGACGACCTT pcrsbe2con.seq

1410 1420 1430

1398 aagtctttgattgataaagctcatgagctaggaattg
1398 aagtcttHgattgataaagctcatgagctaggaattg
1399 aagtctttgattgataaagctcatgagctaggaattg
1174 aagtctttgattgataaagctcatgagctaggaattg
1169 aagtctttgattgataaagctcatgagctaggaattg

1480 1490 1500

1468 CAAATAATACTTTAGATGGACTGAACATGTTTGACGG
1468 CAAATAATACTTTAGATGGACTGAACATGTTTGACGG
1469 CAAATAATACTTTAGATGGACTGAACATGTTTGACflG
1244 CAAATAATACTTTAGATGGACTGAACATGTTTGACGG
1239 CAAATAATACTTTAGATGGACTGAACATGTTTGACGG

1550 1560 1570

1538 tggttatcattggatgtgggattdccgcctctttaac
1538 tggttatcattggatgtgggattBccgcctctttaac
1539 tggttatcattggatgtgggattcccgcctctttaac
1314 tggttatcattggatgtgggattcccgcctfltttaac
1309 tggttatcattggatgtgggattcccgcctctttaac

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1620 



1630 
x 



1640 
x 



1608 tcaaatgcgagatggtggttggatgagttcaaatttg 
1-607 tcaaatgcgagatggtggttggatgagttcaaatttg 

1609 tcaaatgcgagatggtggttggatgHgttcaaatttg 
1 384 tcaaatgcgagatggtggttggatgagttcaaatttg 
1379 tcaaatgcgagatggtggttggatgagttcaaatttg 

1690 1700 1710 

1678 TGT@TACTCACCACGGATTATGGGTGGGATTCACTGG 
1677 TGTATACTCACCACGGATTATCGGTGGGATTCACTGG 
1679 TGTATAflTCACCACGGATTATCGGTGGGATTCACTGG 
1454 TGTATACTCACCACGGATTATCGGTGGGATTCACTGG 
1449 TGTATACTCACCACGGATTATCGGTGGGATTCACTGG 
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1440 1450 1460 1470 

ttgttctcatggacattgttcacagccatgcat 10con. seq 
ttgttctcatggacatHgttcacagccatgcat 11con. seq 
TTGTTCTCATGGACATTGTTCACAGCCATGCAT 19con. seq 
TTGTTCTCATGGACATTGTTCACAGCCATGCAT 86CON. SEQ 
TTGTTCTCATGGACATTGTTCACAGCCATGCAT pcrsbe2con. seq 



1510 1520 1530 1540 

CACQGATAGTTGTTACTTTCACTCTGGAGCTCG 10con. seq 
CACCGATAGTTGTTACTTTCACTCTGGAGCTCG 11con. seq 
CACCGATAGTTGTTACTTTCACTCTGGAGCTCG 19con. seq 
CACCGATAGTTGTTACTTTCACTCTGGAGCTCG 86CON. SEQ 
CACEjGATAGTTGTTACTTTCACTCTGGAGCTCG pcrsbe2con. seq 



1580 1590 1600 1610 

TATGGAAACTGGGAGGTACTTAGGTATCTTCTC 10con. seq 
TATGGAAACTGGGAGGTACTTAGGTATCTTCTC 11con. seq 
TATGGAAACTGGGAGGTACTTAGGTATCTTCTC 19con. seq 
TATGGAAACTGGGAGGTACTTAGGTATCTTCTC 86CON. SEQ 
TATGGAAACTGGGAGGTACTTAGGTATCTTCTC pcrsbe2con. seq 



1650 1660 1670 1680 

atggatttagatttgatggtgtgacatcaatga 10con. seq 
atggatttagattHgatggtgtgacatcaatga 11con. seq 
atggatttagatttgatggtgtgacatcaatga 19con. seq 
atggatttagatttgatggtgtgacatcaatga 86CON. SEQ 
atggatttagatttgatggtgtgacatcaatga pcrsbe2con. seq 



1720 1730 1740 1750 

GAACTACGAGGAATACTTTGGACTCGCAACTGA 10con. seq 
GAACTACGAGGAATACTTTGGACTCGCAACTGA 11con. seq 
GAACTACGAGGAATACTTTGGACTCGCAACTGA 19con. seq 
GAACTACGAGGAATACTTTGGACTCGCAACTGA 86CON. SEQ 
GAACTACGAGGAATACTTTGGACTCGCAACTGA pcrsbe2con. seq 
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1760 1770 1780 

1748 TGTGGATGCTGTTGTGTATCTGATGCTGGTCAACGAT 
1747 TGTGGATGCTGTTGTGTATCTGATGCTGGTCAACGAT 
1749 TGTGGATGCTGTTGTGTATCTGATGCTGGTCAACGAT 
1524 TGTGGATGCTGTTGTGTATCTGATGCTGGTCAACGAT 
1519 TGTGGATGCTGTTGTGTATCTGATGCTGGTCAACGAT 



1830 1840 1850 

1818 ATTGGTGAAGATGTTAGCGGAATGCCGACATTTTGT@ 
1817 ATTGGTGAAGATGTTAGCGGAATGCCGACATTTTGTA 
1819 ATTGGTGAAGATGTTAGCGGAATGCCGACATTTTGTA 
1594 ATTGGTGAAGATGTTAGCGGAATGCCGACATTTTGTA 
1589 ATTGGTGAAGATGTTAGCGGAATGCCGACATTTTGTA 



1900 1910 1920 

1888 ATCGGCTGCATATGGCAATTGCTGATAAATGGATTGA 
1887 ATCGGCTGCATATGGCAATTGCTGATAAATGGATTGA 
1889 ATCGGCTGCATATGGCAATTGCTGATAAAHGGATTGA 
1664 ATCGGCTGCATATGGCAATTGCTGATAAATGGATTGA 
1659 ATCGGCTGCATATGGCAATTGCTGATAAATGGATTGA 
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1970 1980 1990 

1958 GGGTGATATTGTTCATACACTGACAAATAGAAGATGG 
1957 GGGTGATATTGTTCATACACTGACAAATAGAAGATGG 
1959 GGGTGATATTGTTCATACACTGACAAATAGAAGATGG 
1734 GGGTGATATTGTTCATACACTGACAAATAGAAGATGG 
1729 GGGTGATATTGTTCATACACTGACAAATAGAAGATGG 



2040 2050 2060 

2028 gatcaagctctagtcggtgataaaactatagcattct 
2027 gatcaagctctagtcggtgataaaactatagcattct 
2029 gatcaagctctagtcggtgataaaactatagcattct 
1804 gatcaagctctagtcggtgataaaactatagcattct 
1799 gatcaagctctagtcggtgataaaactatagcatDct 
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1790 1800 1810 1820 

CTTATTCATGGGCTTTTCCCAGATGCAATTACC 10con. seq 
CTTATTCAT0GGCTTTTCCCAGATGCAATTACC 11con. seq 
CTTATTCATGGGCTTTTCCCAGATGCAATTACC 19con. seq 
CTTATTCATGGGCTTTTCCCAGATGCAATTACC 86CON. SEQ 
CTTATTCAHGGGCTTTTCCCAGATGCAATTACC pcrsbe2con. seq 

1860 1870 1880 1890 

TTCCCGTTCAAGATGGGGGTGTTGGCTTTGACT 10con. seq 
TTCCCGTTCAAGATGGGGGTGTTGGCTTTGACT 11con. seq 
TTCCCGTBCAAGABGGGGGTGTTGGCTTTGACT 19con. seq 
TTCCCGTTCAAGATGGGGGTGTTGGCTTTGACT 86CON. SEQ 
TTCCCGTTCAAGATGGGGGTGTTGGCTTTGACT pcrsbe2con. seq 

1930 1940 1950 1960 

GTTGCTCAAGAAACGGGATGAGGATTGGAGAGT 10con. seq 
GTTGCTCAAGAAACGGGATGAGGATTGGAGAGT 11con. seq 
GTTGCTCAAGAAACGGGATGAGGATTGGAGAGT 19con. seq 
GTTGCTCAAGAAACGGGATGAGGATTGGAGAGT 86CON. SEQ 
GTTGCTCAAGAAACGGGATGAGGATTGGAGAGT pcrsbe2con. seq 

2000 2010 2020 2030 

TCGGAAAAGTGTGTTTCATACGCTGAAAGTCAT 10con. seq 
TCGGAAAAGTGTGTTTCATACGCTGAAAGTCAT 11con. seq 
TCGGAAAAGTGTGTTTCATACGCTGAAAGTCAT 19con. seq 
TCGGAAAAGTGTGTTTCATACGCTGAAAGTCAT 86CON. SEQ 
TCGGAAAAGTGTGTTTCATlBCGCTGAAAGTCAT pcrsbe2con. seq 

2070 2080 2090 2100 

GGCTGATGGACAAGGATATGTATGATTTTATGG 10con. seq 
GGCTGATGGACAAGGATATGTATGATTTTATGG 11con. seq 
GGCTGATGGACAAGGATATGTATGATTTTATGG 19con. seq 
GGCTGATGGACAAGGATATGTATGATTTTATGG 86CON. SEQ 
GGCTGATGGACAAGGATATGTATGATTTTATGG pcrsbe2con. seq 
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2110 2120 2130 

2098 CTCTGGATAGACCGTCAACATCATTAATAGATXGTGG 
2097 CTCTGGATAGACCG§CAACATCATTAATAGATCGTGG 
2099 CTCTGGATAGACCGTCAACATCATTAATAGATCGTGG 
1874 ctctggatagaccggcaacatcattaatagatcgtgg 
1869 ctctggatagaccgdcaacaHcattaatagatcgtgg 

2180 2190 2200 

2168 TATGGGATTAGGAGGAGAAGGGTACCTAAATTTCATG 
2167 TATGGGATTAGGAGGAGAAGGGTAGCTAAATTTCATG 
2169 TATGGGATTAGGAGGAGAAGGGTACCTAAATTTCATG 
1944 TATGGGATTAGGAGGAGAAGGGTACCTAAATTTCATG 
1939 TATGGGATTAGGAGGAGAAGGGTACCTAAATTTCATG 



2250 2260 2270 

2238 TTCCCTAGGGCTGAACAACACCTCTCTGATGGCTCAG 
2237 TTCCCTAGGGCTGA@CiACACCTflTCTGATGGCTCAG 
2239 TTCCCTAGGGCTGAACAACACCTCTCTGATGGCTCAG 
2014 TTCCCTAGGGCTGAACAACACCTCTCTGATG0CTCAG 
2009 TTCCCTAGGGCTGA0CAACACCTCTCTGATGGCTCAG 
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2320 2330 2340 

2308 GCAGACGGAGATTTGACCTGGGAGATGCAGAATATTT 
2307 GCAGACGGAGATTTGACCTGGGAGATGCAGAATATTT 
2309 GCAGACGGAGATTTGACCTGGGAGATGCAGAATATTT 
2084 GGAGACGGAGATTTGACCTGGGAGATGCAGAATATTT 
2079 GCAGACGGAGATTTGACCTGGGAGATGCAGAATATTT 



2390 2400 2410 

2378 TATGCAGTATCTTGAAGATAAATATGAGTTTATGACT 
2377 TATGCAGTATCTTGAAGATAAATATGAGTTTATGACT 
2379 TATGCAGTATCTTGAAGATAAATATGAGTTTATGACT 
2154 TATGCAGTATCTTGAAGATAAATATGAGTTTATGACT 
2149 TATGCAGTATCTTGAAGATAAATATGAGTTTATGACT 
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2140 2150 2160 2170 

GATAGCATTBCACAAGATGATTAGGCTTGTAAt 10con. seq 
GATAGCATTGCACAAGATGATTAGGCTTGTAAC 11con. seq 
GATAGCATTGCACAAGATGATTAGGCTTGTAAC 19con. seq 
GATAGCATTGCACAAGATGATTAGGCTTGTAAC 86CON. SEQ 
GATAGCATTGCACAAGATGATTAGGCTTGTAAC pcrsbe2con. seq 

2210 2220 2230 2240 

GGAAATGAATTCGGCCACCCTGAGTGGATTGAT 10con. seq 
GGAAATGAATTCGGCCACCCTGAGTGGATTGAT 11con. seq 
GGAAATGAATTCGGCCACCCTGAGTGGATTGAT 19con. seq 
GGAAATGAATTCGGCCACCCTGAGTGGATTGAT 86CON. SEQ 
GGAAATGAATTCGGCCACCCTGAGTGGATTGAT pcrsbe2con. seq 

2280 2290 2300 2310 

TAATTCCCBGAAACCAATTCAGTTATGATAAAT 10con. seq 
TAATTCCCGGAAACCAATTCAGTTATGATAAAT 11con. seq 
TAATgCCCGGAAACCAATTCAGTTATGATAAAT 19con. seq 
TAATTCCCGGAAACCAATTCAGTTATGATAAAT 86CON. SEQ 
TAATTCCCGGAAACCAATTCAGTTATGATAAAT pcrsbe2con. seq 

2350 2360 2370 2380 

AAGATACCGTGGGTTGCAAGAATTTGACCGGGC 10con. seq 
AAGATACCBTGGGTT0CAAGAATTTGACQGGGC 11con. seq 
AAGATACCGTGGGTTGCAAGAATTTGACCGGHC 19con. seq 
AAGATACCGTGGGTTGCAAGAATTTGACCGGGC 86CON. SEQ 
AAGATACC0TGGGTTGCAAGAATTTGACCGGGC pcrsbe2con. seq 

2420 2430 2440 2450 

TCAGAACACCAGTTCATATCACGAAAGGATGAA 10con. seq 
TCAGAACACCAGTTCATATCACGAAAGGATGAA 11con. seq 
TCAGAACACCAGTTCATATCACGAAAGGATGAA 19con. seq 
TCAGAACACCAGTTCATATCACGAAAGGATGAA 86CON. SEQ 
TCAGAACACCAGTTCATATCACGAAAGGATGAA pcrsbe2con. seq 
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2460 2470 2480 

2448 GGAGATAGGATGATTGTATTTGAAAAAGGAAACCTAG 
2447 GGAGATAGGATGATTGTATTTGAAA0AGGAAACCTAG 
2449 GGAGATAGGATGATTGTATTTGAAAAAGGAAACCTAG 
2224 GGAGATAGGATGATTGTATTTGAAAAAGGAAACCTAG 
2219 GGAGATAGGATGATTGTATTTGAAA0AGGAAACCTAG 

2530 2540 2550 

2518 ATTCAGACTATCGCATAGGCTGCCTGAAGCCTGGAAA 
2517 ATTCAGACTATCGCATAGGCTGCCTGAAGCCTGGAAA 
2519 ATTCAGACTATCGCATAGiCTGCCTGAAGCCTGGAAA 
2294 ATTCAGACTATCGCATAGGCTGCCTGAAGCCTGGAAA 
2289 ATTCAGACTATCGCATAGGCTGCCTGAAGCCTGGAAA 



2600 2610 2620 

2588 TTTTGGTGGCTTCGGGAGAATTGATCATAATGCCGAA 
2587 TTTTGGTGGCTTCGGGAGAATTGATCATAATGCCGAA 
2589 TTTTGGTGGCTTCGGGAGAATTGATCATAATGCCGAA 
2364 TTTTGGTGGCTTCGGGAGAATTGATCATAATGCCGAA 
2359 TTTTGGTGGCTTCGGGAGAATTGATCATAATGCCGAA 



2670 2680 2690 

2658 CCTCGTTCAATTATGGTGTATGCACCTAGTAGAACAG 
2657 CCTflGTTCAATTATGGTGTATGCACCTAGTAGAACAG 
2659 CCTCGTTCAATTATGGTGTATGCACCTDGTA0AACAG 
2434 CCTCGTTCAATTATGGTGTATGCACCTflGTAGAACAG 
2429 CCTCGTTCAATTATGGTGTATGCACCTAGTAGAACAG 

2740 2750 2760 

2722 AAGAAGAAGAAGAAGAAGAAGTAGCAGTAGT 
2722 lAGAAGTAGCAGTAGT 
2729 aagaagaagaagaagaagaagaagaagtagcagBagt 
2501 aagaagaagaagaagaagaagaagaagtagcagtagt 
2499 EagaagaagaagaagaaI 
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2490 2500 2510 2520 

TTTTTGTCTTTAATTTTCACTGGACAAAAHgCI 10con. seq 
TTTTBGTCTTTAATTTTCACTGGACAAADAGCT 11con. seq 
TTTTTGTCTTTAATTTTCACTGGACAAAAAGCT 19con. seq 
TTTTTGTCTTTAATTTTCACTGGACAAAAAGCT 86CON. SEQ 
TTTTTGTCTTTAATTTTCACTGGACAAAflAGCT pcrsbe2con. seq 

2560 2570 2580 2590 

ATACAAGGTTGCCTTGGACTCAGATGATCCACT 10con. seq 
ATACAAGGTTGflCTTGGACTCAGATGATCCACT 11con. seq 
ATACAAGGTTGCCTTGGACTCAGATGATCCACT 19con. seq 
ATACAAGGTTGCCTTGGACTCAGATGATCCACT 86CON. SEQ 
ATACAAGGTTGBCTTGGACTCAGATGATCCACT pcrsbe2con. seq 

2630 2640 2650 2660 

TATTTCACCTTTGAAGGATGGTATGATGATCGT 10con. seq 
TATTTCACCTHTGAAGGATBGTATGATGATCGT 11con. seq 
TATTTCACCTTTGAAGGATGGTATGATGATCGT 19con. seq 
TATTTCACCTTTGAAGGATGGTATGATGATCGT 86CON. SEQ 
TATTTCACCTBJGAAGGATBGTATGATGATCGT pcrsbe2con. seq 

2700 2710 2720 

CAGTGGTCTATGCACTAGTAGACAAAGH- -B& 10con. seq 
CAGTGGTCTATGCACTAGTAGACAAAHtHHB 11con. seq 
CAGTGGTCTATGCACTAGTAGACAAAGEEBAAG 19con. seq 
CAGTGGTCTATGCACTAGTAGACAAAiGEFIAAG 86CON. SEQ 
CAGTGGTCTATGCACTAGTAGACAAAElT^AAG pcrsbe2con. seq 

2770 2780 2790 2800 

AGAAGAAGTAGTAGTAGAAGAAGAATGAACGAA 10con. seq 
AGAAGAAEEEBTQGflBXBBAAGAATGAACGAA 11con. seq 
AGAAGAAGTAGTAGTAGAAGAAGAATGAACGAA 19con. seq 
AGAAGAAGTAGJ^GJAGAAGAAGAATGAACGAA 86CON. SEQ 
GlSCHGAAGAATBBB pcrsbe2con. seq 
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2810 2820 2830 


2786 CTTGTGATCGCGTTGAAAGATTTGAACGCHACATAGA 

2764 cttgtgatcgcgttgaaagatttgaacgBtacBt@gB 
2799 ctt6tg atcgcgttgaaagatttgaacgctacataga 

2571 cttgtgP*™* 5 **" 8 ^^ 

2529 




2880 2890 2900 


2856 CTTGGCGGAATTTCATGTGACAACA-GGTTTGCAATT 
2829 CTTGGCGGAATT@CATGTGACAACA0GGTTTGCABTT 

2869 cttggcggaatttcatgtgacaBBa-ggtttg caatt 

2576 ' 
2529 




2950 2960 2970 

2925 GAGATGAAGTGCTGAACAAA^CATATGTAAAAT.CGA 
2899 GAGATGAAGTGCTGAACAAA- - CATATGTAAAATCGA 

2938 GAGATGAAGTGCTGAACAAA- -CATATGTAAAATCGA 
2576 "~ 

2529 




3020 3030 

2995 CCTGCAG-- ' cC 

2967 CCTGCA G- q 

3006 CCTGCAG ^^^^^WgiEraii cT 

2529 siwrai- ---------------- -BT; 
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2840 2850 2860 2870 

GCTTCTTGACGTATCTGGCAATATTGCATflAGT 10con. seq 
SBTfflTfflAC0TABSB3GC|iHlT^CATCAGT 11con. seq 
GCTTCTTGACGTATCTGGCAATATTGCATCAGT 19con. seq 
86CON. SEQ 
pcrsbe2con. seq 




2910 2920 2930 2940 

CTTTCCACTATTAGTAGTGCAACGATATACGCA 10con. seq 
CTTTCCACTATTAGTAGTHCABCGATATACGCA 11con. seq 
CTTTCCACTATTAGTAGTGCAACGATATACGCA 19con. seq 
86CON. SEQ 
pcrsbe2con. seq 




2980 2990 3000 3010 

TGAATTTATGTCGAATGCTGGGACGATCGAATT 10con. seq 
TGAATTTATGTCGAATGCTGGGACGATCGAATT 11con. seq 
TGAATTTATGTCGAATGCTGGGACGATCGAATT 19con. seq 
86CON. SEQ 
pcrsbe2con. seq 





10con. seq 
11con. seq 
19con. seq 
86CON. SEQ 
pcrsbe2con. seq 
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